Plant amino acid biosynthetic enzymes

ABSTRACT

This invention relates to an isolated nucleic acid fragment encoding a plant cysteine γ synthase. The invention also relates to the construction of a chimeric gene encoding all or a portion of the plant cysteine γ synthase, in sense or antisense orientation, wherein expression of the chimeric gene results in production of altered levels of the plant biosynthetic enzyme in a transformed host cell.

[0001] This application is a continuation-in-part of application Ser. No. 09/424,976 filed on Dec. 2, 1999, which is a national stage application of PCT/US98/12073 with an International filing date of Jun. 11, 1998, which in turn claims priority benefit of U.S. Provisional Application No. 60/049406, filed Jun. 12, 1997 and U.S. Provisional Application No. 60/065385, filed Nov. 12, 1997.

FIELD OF THE INVENTION

[0002] This invention is in the field of plant molecular biology. More specifically, this invention pertains to nucleic acid fragments encoding enzymes involved in amino acid biosynthesis in plants and seeds.

BACKGROUND OF THE INVENTION

[0003] Many vertebrates, including humans, lack the ability to manufacture a number of amino acids and therefore require these amino acids in their diet. These are called essential amino acids. Grain-derived foods or feed, however, are deficient in certain essential amino acids, such as lysine, the sulfur-containing amino acids methionine and cysteine, threonine and tryptophan. For example, in corn (Zea mays L.) lysine is the most limiting amino acid for the dietary requirements of many animals, and soybean (Glycine max L.) meal is used as an additive to corn-based animal feeds primarily as a lysine supplement. Often microbial-fermentation produced lysine is needed for such supplementation. Thus, an increase in lysine content of either corn or soybean would reduce or eliminate the need to supplement mixed grain feeds with lysine produced via fermentation.

[0004] Furthermore, in corn the sulfur amino acids are the third most limiting amino acids, after lysine and tryptophan, for the dietary requirements of many animals. Legume plants, however, while rich in lysine and tryptophan, have low sulfur-containing amino acid content. Therefore, the use of soybean meal to supplement corn in animal feed is not satisfactory. An increase in the sulfur amino acid content of either corn or soybean would improve the nutritional quality of the mixtures and reduce the need for further supplementation through addition of more expensive methionine.

[0005] One approach to increasing the nutritional quality of human foods and animal feed is to increase the production and accumulation of specific free amino acids via genetic engineering of the biosynthetic pathway of the essential amino acids. Biosynthetically, lysine, threonine, methionine, cysteine and isoleucine are all derived from aspartate. Regulation of the biosynthesis of each member of this family is interconnected (see FIG. 1). The organization of the pathway leading to biosynthesis of lysine, threonine, methionine, cysteine and isoleucine indicates that over-expression or reduction of expression of genes encoding, inter alia, aspartic semialdehyde dehydrogenase, homoserine kinase, diaminopimelate decarboxylase, cysteine synthase and cystathionine β-lyase in corn and soybean could be used to alter levels of these amino acids in human food and animal feed. However, few of the genes encoding enzymes that regulate this pathway in plants, especially corn and soybeans, are available. Accordingly, availability of nucleic acid sequences encoding all or a portion of these enzymes would facilitate development of nutritionally improved crop plants.

SUMMARY OF THE INVENTION

[0006] The present invention relates to isolated polynucleotides selected from the group consisting of SEQ ID NOs:1, 3, 5, 42, 44, 46, 48, 50, 8, 10, 12, 14, 16, 18, 53, 55, 21, 23, 25, 27, 58, 30, 61, 63, 33, 35, 37, 39, 67, 69, and 71.

[0007] The present invention concerns isolated polynucleotides comprising a nucleotide sequence selected from the group consisting of: (a) a nucleotide sequence encoding a polypeptide of at least 60 amino acids having at least 80% identity based on the Clustal method of alignment when compared to a polypeptide selected from the group consisting of SEQ ID NOs:2, 4, 6, 43, 45, 47, 49, and 51; (b) a nucleotide sequence encoding a polypeptide of at least 60 amino acids having at least 95% identity based on the Clustal method of alignment when compared to a polypeptide selected from the group consisting of SEQ ID NOs:9, 11, 13, 15, 17, 19, 54 and 56; (c) a nucleotide sequence encoding a polypeptide of at least 60 amino acids having at least 80% identity based on the Clustal method of alignment when compared to a polypeptide selected from the group consisting of SEQ ID NOs:22, 24, 26, 28, and 59; (d) a nucleotide sequence encoding a polypeptide of at least 60 amino acids having at least 95% identity based on the Clustal method of alignment when compared to a polypeptide selected from the group consisting of SEQ ID NOs:31, 62, and 64; and (e) a nucleotide sequence encoding a polypeptide of at least 60 amino acids having at least 85% identity based on the Clustal method of alignment when compared to a polypeptide selected from the group consisting of SEQ ID NOs:34, 36, 38, 40, 68, 70, and 72. It is preferred that the identity be at least 85%, more preferably at least 90%, still more preferably at least 95%. This invention also relates to the isolated complement of such polynucleotides, wherein the complement and the polynucleotide consist of the same number of nucleotides, and the nucleotide sequences of the complement and the polynucleotide have 100% complementarity.

[0008] In a third embodiment, the nucleotide sequence of the isolated first polynucleotide is selected from SEQ ID NOs:1, 3, 5, 42, 44, 46, 48, 50, SEQ ID NOs:8, 10, 12, 14, 16, 18, 53 and 55, SEQ ID NOs:21, 23, 25, 27, and 58, SEQ ID NOs:30, 61, and 63, and SEQ ID NOs:33, 35, 37, 39, 67, 69, and 71.

[0009] In a fourth embodiment, this invention concerns an isolated polynucleotide encoding an aspartic semialdehyde dehydrogenase, a diaminopimelate decarboxylase, a homoserine kinase, a cysteine γ synthase or a cystathionine β-lyase.

[0010] In a fifth embodiment, this invention relates to a chimeric gene comprising the polynucleotide of the present invention.

[0011] In a sixth embodiment, the present invention concerns an isolated nucleic acid molecule that comprises at least 180 nucleotides and remains hybridized with the isolated polynucleotide of the present invention under a wash condition of 0.1×SSC, 0.1% SDS, and 65° C.

[0012] In a seventh embodiment, the invention also relates to a host cell comprising a chimeric gene of the present invention or an isolated polynucleotide of the present invention. The host cell may be eukaryotic, such as a yeast cell or a plant cell, or prokaryotic, such as a bacterial cell. The present invention may also relate to a virus comprising an isolated polynucleotide of the present invention or a chimeric gene of the present invention.

[0013] In an eighth embodiment, the invention concerns a transgenic plant comprising a polynucleotide of the present invention.

[0014] In a ninth embodiment, the invention relates to a method for transforming a cell by introducing into such cell the polynucleotide of the present invention, or a method of producing a transgenic plant by transforming a plant cell with the polynucleotide of the present invention and regenerating a plant from the transformed plant cell.

[0015] In a tenth embodiment, the invention concerns a method for producing a nucleotide fragment by selecting a nucleotide sequence comprised by a polynucleotide of the present invention and synthesizing a polynucleotide fragment containing the nucleotide sequence. It is understood that the nucleotide fragment may be produced in vitro or in vivo.

[0016] In an eleventh embodiment, the invention concerns an isolated polypeptide comprising an amino acid sequence selected from the group consisting of: (a) a polypeptide of at least 60 amino acids having a sequence identity of at least 80% based on the Clustal method of alignment when compared to an amino acid sequence selected from the group consisting of SEQ ID NOs:2, 4, 6, 43, 45, 47, 49, and 51; (b) a polypeptide of at least 60 amino acids having a sequence identity of at least 95% based on the Clustal method of alignment when compared to an amino acid sequence selected from the group consisting of SEQ ID NOs:9, 11, 13, 15, 17, 19, 54 and 56; (c) a polypeptide of at least 60 amino acids having a sequence identity of at least 80% based on the Clustal method of alignment when compared to an amino acid sequence selected from the group consisting of SEQ ID NOs:22, 24, 26, 28, and 59; (d) a polypeptide of at least 60 amino acids having a sequence identity of at least 95% based on the Clustal method of alignment when compared to an amino acid sequence selected from the group consisting of SEQ ID NOs:31, 62, and 64; and (e) a polypeptide of at least 60 amino acids having a sequence identity of at least 85% based on the Clustal method of alignment when compared to an amino acid sequence selected from the group consisting of SEQ ID NOs:34, 36, 38, 40, 68, 70, and 72. It is preferred that the identity be at least 85%, more preferably at least 90%, and still more preferably at least 95%.

[0017] In a twelfth embodiment, the invention relates to an isolated polypeptide selected from SEQ ID NOs:2, 4, 6, 43, 45, 47, 49, and 51, SEQ ID NOs:9, 11, 13, 15, 17, 19, 54 and 56, SEQ ID NOs:22, 24, 26, 28, and 59, SEQ ID NOs:31, 62, and 64, and SEQ ID NOs:34, 36, 38, 40, 68, 70, and 72.

[0018] In a thirteenth embodiment, this invention concerns an isolated polypeptide having aspartic semialdehyde dehydrogenase, diaminopimelate decarboxylase, homoserine kinase, cysteine γ synthase, or cystathionine β-lyase function.

[0019] In a fourteenth embodiment, this invention relates to a method of altering the level of expression of a plant biosynthetic enzyme in a host cell comprising: transforming a host cell with a chimeric gene of the present invention; and growing the transformed host cell under conditions that are suitable for expression of the chimeric gene.

[0020] A further embodiment of the instant invention is a method for evaluating a compound for its ability to inhibit the activity of a plant biosynthetic enzyme selected from the group consisting of aspartic semialdehyde dehydrogenase, diaminopimelate decarboxylase, homoserine kinase, cysteine γ synthase and cystathionine β-lyase, the method comprising the steps of: (a) transforming a host cell with a chimeric gene comprising a nucleic acid fragment encoding a plant biosynthetic enzyme selected from the group consisting of aspartic semialdehyde dehydrogenase, diaminopimelate decarboxylase, homoserine kinase, cysteine synthase and cystathionine β-lyase, operably linked to regulatory sequences; (b) growing the transformed host cell under conditions that are suitable for expression of the chimeric gene wherein expression of the chimeric gene results in production of the biosynthetic enzyme in the transformed host cell; (c) optionally purifying the biosynthetic enzyme expressed by the transformed host cell; (d) treating the biosynthetic enzyme with a compound to be tested; and (e) comparing the activity of the biosynthetic enzyme that has been treated with a test compound to the activity of an untreated biosynthetic enzyme.

BRIEF DESCRIPTION OF THE DRAWINGS AND SEQUENCE LISTINGS

[0021] The invention can be more fully understood from the following detailed description and the accompanying drawings and Sequence Listing which form a part of this application.

[0022] FIG. 1 depicts the biosynthetic pathway for the aspartate family of amino acids. The following abbreviations are used: AK=aspartokinase; ASADH=aspartic semialdehyde dehydrogenase; DHDPS=dihydrodipicolinate synthase; DHDPR=dihydrodipicolinate reductase; DAPEP=diaminopimelate epimerase; DAPDC=diaminopimelate decarboxylase; HDH=homoserine dehydrogenase; HK=homoserine kinase; TS=threonine synthase; TD=threonine deaminase; CγS=cystathionine γ-synthase; CβL=cystathionine β-lyase; MS=methionine synthase; CS=cysteine synthase; and SAMS=S-adenosylmethionine synthase.

[0023] FIGS. 2 through 6 show the amino acid sequence alignments between the known art sequences for aspartic semialdehyde dehydrogenase, diaminopimelate decarboxylase, homoserine kinase, cysteine γ synthase, and cystathionine β-lyase with the sequences included in this application. Alignments were performed using the Clustal algorithm described in Higgins and Sharp (1989) (CABIOS 5:151-153). Amino acids conserved among all sequences are indicated by an asterisk (*) above the alignment. Dashes are used by the program to maximize the alignment. A description of FIGS. 2 through 6 follows:

[0024] FIG. 2 shows a comparison of the aspartic semialdehyde dehydrogenase amino acid sequences from corn contig assembled from clones p0003.cgpha22r:fis, cpe1c.pk009.b24, p0016.ctscp83r, and p0075.cslab16r (SEQ ID NO:43), rice clone rlr48.pk0003.d12 (SEQ ID NO:2), the contig of 5′ RACE PCR and rice clone rlr48.pk0003.d12 (SEQ ID NO:45), soybean clones sf11.pk0122.f9 (SEQ ID NO:6), ses9c.pk001.a15:fis (SEQ ID NO:47), and sf11.pk0122.f9:fis (SEQ ID NO:49), wheat clones wr1.pk0004.c11 (SEQ ID NO:4) and wdk1c.pk014.n5:fis (SEQ ID NO:51) with the Legionella pneumophila (NCBI General Identifier No. 2645882; SEQ ID NO:7) and the Aquifex aeolicus sequences (NCBI General Identifier No. 6225258; SEQ ID NO:52). FIG. 2A: positions 1 through 120; FIG. 2B: positions 121 through 240; FIG. 2C: positions 241 through 360; FIG. 2D: positions 361 through 392.

[0025] FIG. 3 shows a comparison of the diaminopimelate decarboxylase amino acid sequences derived from corn clones cen3n.pk0067.a3 (SEQ ID NO:9) and cr1n.pk0103.d8 (SEQ ID NO:11), rice clone r10n.pk0013.b9 (SEQ ID NO:13), soybean clones sr1.pk0132.c1 (SEQ ID NO:15), sdp3c.pk001.o15 (SEQ ID NO:19) and sdp3c.pk000.o15:fis (SEQ ID NO:54), wheat clones wlk1.pk0012.c2 (SEQ ID NO:17) and wlk1.pk0012.c2:fis (SEQ ID NO:56) with the Pseudomonas aeruginosa (NCBI General Identifier No. 118304; SEQ ID NO:20) and Arabidopsis thaliana sequences (NCBI General Identifier No. 9279586; SEQ ID NO:57). FIG. 3A: positions 1 through 120; FIG. 3B: positions 121 through 240; FIG. 3C: positions 241 through 360; FIG. 3D: positions 361 through 480; FIG. 3E: positions 481 through 535.

[0026] FIG. 4 shows a comparison of the homoserine kinase amino acid sequences derived from corn clone cr1n.pk0009.g4 (SEQ ID NO:22), rice clones rcalc.pk005.k3 (SEQ ID NO:24) and rca1c.pk005.k3:fis (SEQ ID NO:59), soybean clone ses8w.pk0020.b5 (SEQ ID NO:26), wheat clone wl1n.pk0065.f2 (SEQ ID NO:28) with the Methanococcus jannaschii (NCBI General Identifier No. 1591748; SEQ ID NO:29) and the Arabidopsis thaliana sequences (NCBI General Identifier No. 4927412; SEQ ID NO:60). FIG. 4A: positions 1 through 180; FIG. 4B: positions 181 through 360; FIG. 4C: positions 361 through 396.

[0027] FIG. 5 shows a comparison of the cysteine γ synthase amino acid sequences derived from the corn contig assembled from clones ccoln.pk083 j4, chp2.pk0016.b1, cpd1c.pk004.b20, cr1n.pk0083.c5, csi1.pk0003.g6, and p0126.cnlcb49r (SEQ ID NO:62), rice clone rls6.pkO068.b7:fis (SEQ ID NO:64), soybean clone se3.05h06 (SEQ ID NO:31) with the Citrullus lanatus sequence (NCBI General Identifier No. 540497; SEQ ID NO:32), the Spinacia oleracea sequence (NCBI General Identifier No. 540497; SEQ ID NO:65), and the Solanum tuberosum sequence (NCBI General Identifier No. 11131628; SEQ ID NO:66).

[0028] FIG. 5A: positions 1 through 180; FIG. 5B: positions 181 through 360; FIG. 5C: positions 361 through 424.

[0029] FIG. 6 shows a comparison of the amino acid sequences of the cystathionine β-lyase derived from corn clone cen1.pk0061.d4 (SEQ ID NO:34), corn contig assembled from clones p0005.cbmei71r, p0014.ctuui39r, p0109.cdadg47r, and p0125.czaay16r (SEQ ID NO:68), rice clone rlr12.pk0026.g1 (SEQ ID NO:36), the contig of 5′ PCR and rice clone rlr12.pk0026.g1:fis (SEQ ID NO:70), soybean clone sf11.pk0012.c4 (SEQ ID NO:38), and wheat clones wr1.pk0091.g6 (SEQ ID NO:40) and wr1.pk0091.g6:fis (SEQ ID NO:72) with the Arabidopsis thaliana sequence (NCBI General Identifier No. 1708993; SEQ ID NO:41). FIG. 6A: positions 1 through 120; FIG. 6B: positions 121 through 240; FIG. 6C: positions 241 through 360; FIG. 6D: positions 361 through 483.

[0030] Table 1 lists the polypeptides that are described herein, the designation of the cDNA clones that comprise the nucleic acid fragments encoding polypeptides representing all or a substantial portion of these polypeptides, and the corresponding identifier (SEQ ID NO:) as used in the attached Sequence Listing. The sequence descriptions and Sequence Listing attached hereto comply with the rules governing nucleotide and/or amino acid sequence disclosures in patent applications as set forth in 37 C.F.R. §1.821-1.825.

TABLE 1
Plant Biosynthetic Enzymes

                                                      SEQ ID NO:
Polypeptide             Clone                         (Nucleotide)  (Amino Acid)
rice ASADH              rlr48.pk0003.d12              1             2
wheat ASADH             wr1.pk0004.c11                3             4
soybean ASADH           sfl1.pk0122.f9                5             6
L. pneumophila ASADH    NCBI GI 2645882                             7
corn DAPEP              cen3n.pk0067.a3               8             9
corn DAPEP              cr1n.pk0103.d8                10            11
rice DAPEP              r10n.pk0013.b9                12            13
soybean DAPEP           sr1.pk0132.c1                 14            15
wheat DAPEP             wlk1.pk0012.c2                16            17
soybean DAPEP           sdp3c.pk001.o15               18            19
P. aeruginosa DAPEP     NCBI GI 118304                              20
corn HK                 cr1n.pk0009.g4                21            22
rice HK                 rca1c.pk005.k3                23            24
soybean HK              ses8w.pk0020.b5               25            26
wheat HK                wl1n.pk0065.f2                27            28
M. jannaschii HK        NCBI GI 1591748                             29
soybean CγS             se3.05h06                     30            31
C. lanatus CγS          NCBI GI 540497                              32
corn CβL                cen1.pk0061.d4                33            34
rice CβL                rlr12.pk0026.g1               35            36
soybean CβL             sfl1.pk0012.c4                37            38
wheat CβL               wr1.pk0091.g6                 39            40
A. thaliana CβL         NCBI GI 1708993                             41
corn ASADH              Contig of:                    42            43
                        p0003.cgpha22r:fis
                        cpe1c.pk009.b24
                        p0016.ctscp83r
                        p0075.cslab16r
rice ASADH              5′ RACE PCR +                 44            45
                        r1r48.pk0003.d12
soybean ASADH           ses9c.pk001.a15:fis           46            47
soybean ASADH           sfl1.pk0122.f9:fis            48            49
wheat ASADH             wdk1c.pk014.n5:fis            50            51
A. aeolicus ASADH       NCBI GI 6225258                             52
soybean DAPEP           sdp3c.pk001.o15:fis           53            54
wheat DAPEP             wlk1.pk0012.c2:fis            55            56
A. thaliana DAPEP       NCBI GI 9279586                             57
rice HK                 rca1c.pk005.k3:fis            58            59
A. thaliana HK          NCBI GI 4927412                             60
corn CγS                Contig of:                    61            62
                        cco1n.pk083.j4
                        chp2.pk0016.b1
                        cpd1c.pk004.b20
                        cr1n.pk0083.c5
                        csi1.pk0003.g6
                        p0126.cnlcb49r
rice CγS                rls6.pk0068.b7:fis            63            64
S. oleracea CγS         NCBI GI 416869                              65
S. tuberosum CγS        NCBI GI 11131628                            66
corn CβL                Contig of:                    67            68
                        p0005.cbmei71r
                        p0014.ctuui39r
                        p0109.cdadg47r
                        p0125.czaay16r
rice CβL                5′ RACE PCR +                 69            70
                        rlr12.pk0026.g1:fis
wheat CβL               wr1.pk0091.g6:fis             71            72

[0031] The nucleotide and amino acid sequences shown in SEQ ID NOs:1 through 41 are found, with the same SEQ ID NO, in U.S. application Ser. No. 09/424,976. All or a portion of some of the sequences in the present application are found in the provisional applications to which the present application claims priority. Table 1A indicates the SEQ ID NO: in the present application and the corresponding SEQ ID NO: in the previously-filed provisional application.

TABLE 1A Sequence Priority Application Provisional Application Provisional Application No. 09/424,976 No. 60/049406 No. 60/065385 SEQ ID NO:1 SEQ ID NO:1 SEQ ID NO:2 SEQ ID NO:2 SEQ ID NO:3 SEQ ID NO:3* SEQ ID NO:4 SEQ ID NO:4* SEQ ID NO:8 SEQ ID NO:7 SEQ ID NO:8 SEQ ID NO:9 SEQ ID NO:8 SEQ ID NO:9 SEQ ID NO:12 SEQ ID NO:9 SEQ ID NO:13 SEQ ID NO:10 SEQ ID NO:14 SEQ ID NO:11 SEQ ID NO:5 SEQ ID NO:15 SEQ ID NO:12 SEQ ID NO:6 SEQ ID NO:21 SEQ ID NO:13 SEQ ID NO:10* SEQ ID NO:22 SEQ ID NO:14 SEQ ID NOs:11* and 14* SEQ ID NO:23 SEQ ID NO:17* SEQ ID NO:15 SEQ ID NO:24 SEQ ID NO:18* SEQ ID NO:16 SEQ ID NO:25 SEQ ID NO:15 SEQ ID NO:13 SEQ ID NO:26 SEQ ID NO:16 SEQ ID NO:14 SEQ ID NO:30 SEQ ID NO:19 SEQ ID NO:17 SEQ ID NO:31 SEQ ID NO:20 SEQ ID NO:18 SEQ ID NO:33* SEQ ID NO:21 SEQ ID NO:19 SEQ ID NO:34 SEQ ID NO:22 SEQ ID NO:20 SEQ ID NO:37 SEQ ID NO:23 SEQ ID NO:21* SEQ ID NO:38 SEQ ID NO:24 SEQ ID NO:22*

[0032] The Sequence Listing contains the one letter code for nucleotide sequence characters and the three letter codes for amino acids as defined in conformity with the IUPAC-IUBMB standards described in Nucleic Acids Res. 13:3021-3030 (1985) and in the Biochemical J. 219 (No. 2):345-373 (1984) which are herein incorporated by reference. The symbols and format used for nucleotide and amino acid sequence data comply with the rules set forth in 37 C.F.R. §1.822.

DETAILED DESCRIPTION OF THE INVENTION

[0033] In the context of this disclosure, a number of terms shall be utilized. The terms “polynucleotide,” “polynucleotide sequence,” “nucleic acid sequence,” and “nucleic acid fragment”/“isolated nucleic acid fragment” are used interchangeably herein. These terms encompass nucleotide sequences and the like. A polynucleotide may be a polymer of RNA or DNA that is single- or double-stranded, that optionally contains synthetic, non-natural or altered nucleotide bases. A polynucleotide in the form of a polymer of DNA may be comprised of one or more segments of cDNA, genomic DNA, synthetic DNA, or mixtures thereof. An isolated polynucleotide of the present invention may include at least 30 contiguous nucleotides, preferably at least 40 contiguous nucleotides, most preferably at least 60 contiguous nucleotides derived from SEQ ID NOs:1, 3, 5, 42, 44, 46, 48, 50, SEQ ID NOs:8, 10, 12, 14, 16, 18, 53 and 55, SEQ ID NOs:21, 23, 25, 27, and 58, SEQ ID NOs:30, 61, and 63, and SEQ ID NOs:33, 35, 37, 39, 67, 69, and 71, or the complement of such sequences.

[0034] The term “isolated” polynucleotide refers to a polynucleotide that is substantially free from other nucleic acid sequences with which it is normally associated, such as other chromosomal and extrachromosomal DNA and RNA. Isolated polynucleotides may be purified from a host cell in which they naturally occur. Conventional nucleic acid purification methods known to skilled artisans may be used to obtain isolated polynucleotides. The term also embraces recombinant polynucleotides and chemically synthesized polynucleotides.

[0035] The term “recombinant” means, for example, that a nucleic acid sequence is made by an artificial combination of two otherwise separated segments of sequence, e.g., by chemical synthesis or by the manipulation of isolated nucleic acids by genetic engineering techniques.

[0036] As used herein, “contig” refers to a nucleotide sequence that is assembled from two or more constituent nucleotide sequences that share common or overlapping regions of sequence homology. For example, the nucleotide sequences of two or more nucleic acid fragments can be compared and aligned in order to identify common or overlapping sequences. Where common or overlapping sequences exist between two or more nucleic acid fragments, the sequences (and thus their corresponding nucleic acid fragments) can be assembled into a single contiguous nucleotide sequence.
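
By way of non-limiting illustration, the joining of two overlapping sequences into a single contiguous sequence may be sketched as follows. The function name `merge_overlap` and the `min_overlap` parameter are hypothetical and chosen only for this example. The sketch assumes an exact match between the end of one sequence and the start of the other; practical assembly programs typically tolerate mismatches and sequencing errors.

```python
from typing import Optional


def merge_overlap(left: str, right: str, min_overlap: int = 3) -> Optional[str]:
    """Join two sequences when a suffix of `left` equals a prefix of `right`.

    Returns the merged sequence, or None when no exact overlap of at least
    `min_overlap` characters exists. The longest overlap is preferred.
    """
    longest_possible = min(len(left), len(right))
    for k in range(longest_possible, min_overlap - 1, -1):
        if left[-k:] == right[:k]:
            # Keep `left` whole and append only the non-overlapping tail.
            return left + right[k:]
    return None


if __name__ == "__main__":
    print(merge_overlap("ATGGCCTTA", "CTTAGGC"))
```

In this toy input the two fragments share the four-base overlap CTTA, so the merged sequence contains that region only once.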

[0037] As used herein, “substantially similar” refers to nucleic acid fragments wherein changes in one or more nucleotide bases result in substitution of one or more amino acids, but do not affect the functional properties of the polypeptide encoded by the nucleotide sequence. “Substantially similar” also refers to nucleic acid fragments wherein changes in one or more nucleotide bases do not affect the ability of the nucleic acid fragment to mediate alteration of gene expression by gene silencing through, for example, antisense or co-suppression technology. “Substantially similar” also refers to modifications of the nucleic acid fragments of the instant invention, such as deletion or insertion of one or more nucleotides, that do not substantially affect the functional properties of the resulting transcript vis-à-vis the ability to mediate gene silencing or alteration of the functional properties of the resulting protein molecule. It is therefore understood that the invention encompasses more than the specific exemplary nucleotide or amino acid sequences and includes functional equivalents thereof. The terms “substantially similar” and “corresponding substantially” are used interchangeably herein.

[0038] Substantially similar nucleic acid fragments may be selected by screening nucleic acid fragments representing subfragments or modifications of the nucleic acid fragments of the instant invention, wherein one or more nucleotides are substituted, deleted and/or inserted, for their ability to affect the level of the polypeptide encoded by the unmodified nucleic acid fragment in a plant or plant cell. For example, a substantially similar nucleic acid fragment representing at least 30 contiguous nucleotides derived from the instant nucleic acid fragment can be constructed and introduced into a plant or plant cell. The level of the polypeptide encoded by the unmodified nucleic acid fragment present in a plant or plant cell exposed to the substantially similar nucleic acid fragment can then be compared to the level of the polypeptide in a plant or plant cell that is not exposed to the substantially similar nucleic acid fragment.

[0039] For example, it is well known in the art that antisense suppression and co-suppression of gene expression may be accomplished using nucleic acid fragments representing less than the entire coding region of a gene, and by using nucleic acid fragments that do not share 100% sequence identity with the gene to be suppressed. Moreover, alterations in a nucleic acid fragment which result in the production of a chemically equivalent amino acid at a given site, but do not affect the functional properties of the encoded polypeptide, are well known in the art. Thus, a codon for the amino acid alanine, a hydrophobic amino acid, may be substituted by a codon encoding another less hydrophobic residue, such as glycine, or a more hydrophobic residue, such as valine, leucine, or isoleucine. Similarly, changes which result in substitution of one negatively charged residue for another, such as aspartic acid for glutamic acid, or one positively charged residue for another, such as lysine for arginine, can also be expected to produce a functionally equivalent product. Nucleotide changes which result in alteration of the N-terminal and C-terminal portions of the polypeptide molecule would also not be expected to alter the activity of the polypeptide. Each of the proposed modifications is well within the routine skill in the art, as is determination of retention of biological activity of the encoded products. Consequently, an isolated polynucleotide comprising a nucleotide sequence of at least 30 (preferably at least 40, most preferably at least 60) contiguous nucleotides derived from a nucleotide sequence selected from the group consisting of SEQ ID NOs:1, 3, 5, 42, 44, 46, 48, 50, SEQ ID NOs:8, 10, 12, 14, 16, 18, 53 and 55, SEQ ID NOs:21, 23, 25, 27, and 58, SEQ ID NOs:30, 61, and 63, and SEQ ID NOs:33, 35, 37, 39, 67, 69, and 71 and the complement of such nucleotide sequences may be used in methods of selecting an isolated polynucleotide that affects the expression of an aspartic-semialdehyde dehydrogenase, a diaminopimelate decarboxylase, a homoserine kinase, a cysteine γ synthase, or a cystathionine β-lyase polypeptide in a host cell. A method of selecting an isolated polynucleotide that affects the level of expression of a polypeptide in a host cell may comprise the steps of: constructing an isolated polynucleotide of the present invention or an isolated chimeric gene of the present invention; introducing the isolated polynucleotide or the isolated chimeric gene into a host cell; measuring the level of a polypeptide or enzyme activity in the host cell containing the isolated polynucleotide; and comparing the level of a polypeptide or enzyme activity in the host cell containing the isolated polynucleotide with the level of a polypeptide or enzyme activity in a host cell that does not contain the isolated polynucleotide.
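
Purely as an illustrative sketch, the chemically equivalent substitutions named above (alanine with glycine, valine, leucine or isoleucine; aspartic acid with glutamic acid; lysine with arginine) may be encoded as residue groups and used to flag positions at which a variant differs non-conservatively from a reference. The group membership is limited to the examples given in the text, the one-letter amino acid codes and all function names are assumptions made for this example, and the check says nothing about whether a flagged change actually alters activity.

```python
# Residue groups taken only from the examples in the text (one-letter codes).
CONSERVATIVE_GROUPS = [
    {"A", "G", "V", "L", "I"},  # alanine and the residues it may be swapped with
    {"D", "E"},                 # negatively charged residues
    {"K", "R"},                 # positively charged residues
]


def is_conservative(a: str, b: str) -> bool:
    """True when the residues are identical or share one of the groups above."""
    if a == b:
        return True
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def nonconservative_positions(reference: str, variant: str) -> list:
    """Return 0-based positions where the variant differs non-conservatively."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have the same length")
    return [i for i, (r, v) in enumerate(zip(reference, variant))
            if not is_conservative(r, v)]


if __name__ == "__main__":
    print(nonconservative_positions("MADKL", "MGEKW"))
```

Here the alanine-to-glycine and aspartate-to-glutamate changes are treated as equivalent, and only the final leucine-to-tryptophan change is reported.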

[0040] Moreover, substantially similar nucleic acid fragments may also be characterized by their ability to hybridize. Estimates of such homology are provided by either DNA-DNA or DNA-RNA hybridization under conditions of stringency as is well understood by those skilled in the art (Hames and Higgins, Eds. (1985) Nucleic Acid Hybridisation, IRL Press, Oxford, U.K.). Stringency conditions can be adjusted to screen for moderately similar fragments, such as homologous sequences from distantly related organisms, to highly similar fragments, such as genes that duplicate functional enzymes from closely related organisms. Post-hybridization washes determine stringency conditions. One set of preferred conditions uses a series of washes starting with 6×SSC, 0.5% SDS at room temperature for 15 min, then repeated with 2×SSC, 0.5% SDS at 45° C. for 30 min, and then repeated twice with 0.2×SSC, 0.5% SDS at 50° C. for 30 min. A more preferred set of stringent conditions uses higher temperatures in which the washes are identical to those above except that the temperature of the final two 30 min washes in 0.2×SSC, 0.5% SDS is increased to 60° C. Another preferred set of highly stringent conditions uses two final washes in 0.1×SSC, 0.1% SDS at 65° C.
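
For illustration only, the three sets of wash conditions recited above may be written down as data. The class name `Wash`, its field names, and the comparison rule in `more_stringent` are assumptions made for this sketch; the rule encodes only the general rule of thumb that lower salt and higher temperature give a more stringent wash, and it is not a quantitative model of hybridization.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Wash:
    ssc: float                # salt as a multiple of SSC (0.2 means 0.2xSSC)
    sds_percent: float        # SDS concentration in percent
    temp_c: Optional[float]   # None stands for room temperature
    minutes: Optional[int]    # None where the text states no duration
    repeats: int = 1


# Preferred series: 6xSSC at room temperature, 2xSSC at 45 C, 0.2xSSC twice at 50 C.
PREFERRED = [
    Wash(6.0, 0.5, None, 15),
    Wash(2.0, 0.5, 45.0, 30),
    Wash(0.2, 0.5, 50.0, 30, repeats=2),
]
# More preferred series: identical except the final two washes are at 60 C.
MORE_PREFERRED = PREFERRED[:2] + [Wash(0.2, 0.5, 60.0, 30, repeats=2)]
# Highly stringent set: the text specifies only the two final washes.
HIGHLY_STRINGENT_FINAL = Wash(0.1, 0.1, 65.0, None, repeats=2)


def more_stringent(a: Wash, b: Wash) -> bool:
    """Assumed rule of thumb: salt no higher, temperature no lower, one strictly better."""
    if a.temp_c is None or b.temp_c is None:
        raise ValueError("temperature must be stated to compare washes")
    no_worse = a.ssc <= b.ssc and a.temp_c >= b.temp_c
    strictly_better = a.ssc < b.ssc or a.temp_c > b.temp_c
    return no_worse and strictly_better


if __name__ == "__main__":
    print(more_stringent(HIGHLY_STRINGENT_FINAL, MORE_PREFERRED[-1]))
```

Comparing only the final washes reflects the statement above that post-hybridization washes determine stringency.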

[0041] Substantially similar nucleic acid fragments of the instant invention may also be characterized by the percent identity of the amino acid sequences that they encode to the amino acid sequences disclosed herein, as determined by algorithms commonly employed by those skilled in this art. Suitable nucleic acid fragments (isolated polynucleotides of the present invention) encode polypeptides that are at least about 70% identical, preferably at least about 80% identical to the amino acid sequences reported herein. Preferred nucleic acid fragments encode amino acid sequences that are at least about 85% identical to the amino acid sequences reported herein. More preferred nucleic acid fragments encode amino acid sequences that are at least about 90% identical to the amino acid sequences reported herein. Most preferred are nucleic acid fragments that encode amino acid sequences that are at least about 95% identical to the amino acid sequences reported herein. Suitable nucleic acid fragments not only have the above identities but typically encode a polypeptide having at least 50 amino acids, preferably at least 100 amino acids, more preferably at least 150 amino acids, still more preferably at least 200 amino acids, and most preferably at least 250 amino acids. Sequence alignments and percent identity calculations were performed using the Megalign program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, Wis.). Multiple alignment of the sequences was performed using the Clustal method of alignment (Higgins and Sharp (1989) CABIOS. 5:151-153) with the default parameters (GAP PENALTY=10, GAP LENGTH PENALTY=10). Default parameters for pairwise alignments using the Clustal method were KTUPLE=1, GAP PENALTY=3, WINDOW=5 and DIAGONALS SAVED=5.
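
As a minimal sketch of what a percent identity figure expresses, the following counts identical residues across two sequences that have already been aligned, with dashes marking gaps. It does not reproduce the Megalign program or the Clustal method. The choice to exclude gapped columns from the denominator is an assumption of this example, and other conventions give different values; the function names are likewise hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical residues over columns where neither row has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # assumption: gapped columns are not counted
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, minimum_percent: float) -> bool:
    """True when the identity is at least the stated percentage."""
    return percent_identity(aligned_a, aligned_b) >= minimum_percent


if __name__ == "__main__":
    print(round(percent_identity("MKT-LLV", "MKTALIV"), 1))
```

In the toy alignment, six columns are compared and five match, so the pair would satisfy an 80% threshold but not a 95% threshold.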

[0042] A “substantial portion” of an amino acid or nucleotide sequence comprises an amino acid or a nucleotide sequence that is sufficient to afford putative identification of the protein or gene that the amino acid or nucleotide sequence comprises. Amino acid and nucleotide sequences can be evaluated either manually, by one skilled in the art, or by using computer-based sequence comparison and identification tools that employ algorithms such as BLAST (Basic Local Alignment Search Tool; Altschul et al. (1990) J. Mol. Biol. 215:403-410; see also www.ncbi.nlm.nih.gov/BLAST/). In general, a sequence of ten or more contiguous amino acids or thirty or more contiguous nucleotides is necessary in order to putatively identify a polypeptide or nucleic acid sequence as homologous to a known protein or gene. Moreover, with respect to nucleotide sequences, gene-specific oligonucleotide probes comprising 30 or more contiguous nucleotides may be used in sequence-dependent methods of gene identification (e.g., Southern hybridization) and isolation (e.g., in situ hybridization of bacterial colonies or bacteriophage plaques). In addition, short oligonucleotides of 12 or more nucleotides may be used as amplification primers in PCR in order to obtain a particular nucleic acid fragment comprising the primers. Accordingly, a “substantial portion” of a nucleotide sequence comprises a nucleotide sequence that will afford specific identification and/or isolation of a nucleic acid fragment comprising the sequence. The instant specification teaches amino acid and nucleotide sequences encoding polypeptides that comprise one or more particular plant proteins. The skilled artisan, having the benefit of the sequences as reported herein, may now use all or a substantial portion of the disclosed sequences for purposes known to those skilled in this art. Accordingly, the instant invention comprises the complete sequences as reported in the accompanying Sequence Listing, as well as substantial portions of those sequences as defined above.

[0043] “Codon degeneracy” refers to divergence in the genetic code permitting variation of the nucleotide sequence without affecting the amino acid sequence of an encoded polypeptide. Accordingly, the instant invention relates to any nucleic acid fragment comprising a nucleotide sequence that encodes all or a substantial portion of the amino acid sequences set forth herein. The skilled artisan is well aware of the “codon-bias” exhibited by a specific host cell in usage of nucleotide codons to specify a given amino acid. Therefore, when synthesizing a nucleic acid fragment for improved expression in a host cell, it is desirable to design the nucleic acid fragment such that its frequency of codon usage approaches the frequency of preferred codon usage of the host cell.

[0044] “Synthetic nucleic acid fragments” can be assembled from oligonucleotide building blocks that are chemically synthesized using procedures known to those skilled in the art. These building blocks are ligated and annealed to form larger nucleic acid fragments which may then be enzymatically assembled to construct the entire desired nucleic acid fragment. “Chemically synthesized”, as related to a nucleic acid fragment, means that the component nucleotides were assembled in vitro. Manual chemical synthesis of nucleic acid fragments may be accomplished using well established procedures, or automated chemical synthesis can be performed using one of a number of commercially available machines. Accordingly, the nucleic acid fragments can be tailored for optimal gene expression based on optimization of the nucleotide sequence to reflect the codon bias of the host cell. The skilled artisan appreciates the likelihood of successful gene expression if codon usage is biased towards those codons favored by the host. Determination of preferred codons can be based on a survey of genes derived from the host cell where sequence information is available.
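As a non-limiting sketch of how preferred codons might be determined from a survey of host genes and then applied, the following example counts codon usage in a set of host coding sequences and back-translates a polypeptide using the most frequent codon for each amino acid. The function names and the example sequences are hypothetical, the standard genetic code is assumed, and choosing the single most frequent codon is only one simple strategy for approaching the host's codon usage.

```python
from collections import Counter, defaultdict

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def preferred_codons(coding_sequences):
    """Return {amino acid: most frequent codon} from in-frame host coding sequences."""
    usage = defaultdict(Counter)
    for cds in coding_sequences:
        cds = cds.upper()
        # Walk the sequence codon by codon, ignoring any trailing partial codon.
        for pos in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[pos:pos + 3]
            if codon in CODON_TABLE:
                usage[CODON_TABLE[codon]][codon] += 1
    return {aa: counts.most_common(1)[0][0] for aa, counts in usage.items()}


def back_translate(protein, preferred):
    """Encode a polypeptide with the host-preferred codon for each residue.

    Raises KeyError if a residue was never observed in the survey.
    """
    return "".join(preferred[aa] for aa in protein.upper())


if __name__ == "__main__":
    host_survey = ["ATGGCCGCCGCTAAGTAA"]  # hypothetical host coding sequence
    table = preferred_codons(host_survey)
    print(back_translate("MAK", table))
```

In practice the survey would cover many host genes, and the designer might weight codons by frequency rather than always selecting the most common one.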

[0045] “Gene” refers to a nucleic acid fragment that expresses a specific protein, including regulatory sequences preceding (5′ non-coding sequences) and following (3′ non-coding sequences) the coding sequence. “Native gene” refers to a gene as found in nature with its own regulatory sequences. “Chimeric gene” refers to any gene that is not a native gene, comprising regulatory and coding sequences that are not found together in nature. Accordingly, a chimeric gene may comprise regulatory sequences and coding sequences that are derived from different sources, or regulatory sequences and coding sequences derived from the same source, but arranged in a manner different than that found in nature. “Endogenous gene” refers to a native gene in its natural location in the genome of an organism. A “foreign gene” refers to a gene not normally found in the host organism, but that is introduced into the host organism by gene transfer. Foreign genes can comprise native genes inserted into a non-native organism, or chimeric genes. A “transgene” is a gene that has been introduced into the genome by a transformation procedure.

[0046] “Coding sequence” refers to a nucleotide sequence that codes for a specific amino acid sequence. “Regulatory sequences” refer to nucleotide sequences located upstream (5′ non-coding sequences), within, or downstream (3′ non-coding sequences) of a coding sequence, and which influence the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include promoters, translation leader sequences, introns, and polyadenylation recognition sequences.

[0047] “Promoter” refers to a nucleotide sequence capable of controlling the expression of a coding sequence or functional RNA. In general, a coding sequence is located 3′ to a promoter sequence. The promoter sequence consists of proximal and more distal upstream elements, the latter elements often referred to as enhancers. Accordingly, an “enhancer” is a nucleotide sequence which can stimulate promoter activity and may be an innate element of the promoter or a heterologous element inserted to enhance the level or tissue-specificity of a promoter. Promoters may be derived in their entirety from a native gene, or may be composed of different elements derived from different promoters found in nature, or may even comprise synthetic nucleotide segments. It is understood by those skilled in the art that different promoters may direct the expression of a gene in different tissues or cell types, or at different stages of development, or in response to different environmental conditions. Promoters which cause a nucleic acid fragment to be expressed in most cell types at most times are commonly referred to as “constitutive promoters”. New promoters of various types useful in plant cells are constantly being discovered; numerous examples may be found in the compilation by Okamuro and Goldberg (1989) Biochemistry of Plants 15:1-82. It is further recognized that since in most cases the exact boundaries of regulatory sequences have not been completely defined, nucleic acid fragments of different lengths may have identical promoter activity.

[0048] “Translation leader sequence” refers to a nucleotide sequence located between the promoter sequence of a gene and the coding sequence. The translation leader sequence is present in the fully processed mRNA upstream of the translation start sequence. The translation leader sequence may affect processing of the primary transcript to mRNA, mRNA stability or translation efficiency. Examples of translation leader sequences have been described (Turner and Foster (1995) Mol. Biotechnol. 3:225-236).

[0049] “3′ non-coding sequences” refer to nucleotide sequences located downstream of a coding sequence and include polyadenylation recognition sequences and other sequences encoding regulatory signals capable of affecting mRNA processing or gene expression. The polyadenylation signal is usually characterized by affecting the addition of polyadenylic acid tracts to the 3′ end of the mRNA precursor. The use of different 3′ non-coding sequences is exemplified by Ingelbrecht et al. (1989) Plant Cell 1:671-680.

[0050] “RNA transcript” refers to the product resulting from RNA polymerase-catalyzed transcription of a DNA sequence. When the RNA transcript is a perfect complementary copy of the DNA sequence, it is referred to as the primary transcript or it may be an RNA sequence derived from posttranscriptional processing of the primary transcript and is referred to as the mature RNA. “Messenger RNA (mRNA)” refers to the RNA that is without introns and that can be translated into polypeptides by the cell. “cDNA” refers to DNA that is complementary to and derived from an mRNA template. The cDNA can be single-stranded or converted to double stranded form using, for example, the Klenow fragment of DNA polymerase I. “Sense-RNA” refers to an RNA transcript that includes the mRNA and so can be translated into a polypeptide by the cell. “Antisense RNA” refers to an RNA transcript that is complementary to all or part of a target primary transcript or mRNA and that blocks the expression of a target gene (see U.S. Pat. No. 5,107,065, incorporated herein by reference). The complementarity of an antisense RNA may be with any part of the specific nucleotide sequence, i.e., at the 5′ non-coding sequence, 3′ non-coding sequence, introns, or the coding sequence. “Functional RNA” refers to sense RNA, antisense RNA, ribozyme RNA, or other RNA that may not be translated but yet has an effect on cellular processes.

[0051] The term “operably linked” refers to the association of two or more nucleic acid fragments on a single polynucleotide so that the function of one is affected by the other. For example, a promoter is operably linked with a coding sequence when it is capable of affecting the expression of that coding sequence (i.e., that the coding sequence is under the transcriptional control of the promoter). Coding sequences can be operably linked to regulatory sequences in sense or antisense orientation.

[0052] The term “expression”, as used herein, refers to the transcription and stable accumulation of sense (mRNA) or antisense RNA derived from the nucleic acid fragment of the invention. Expression may also refer to translation of mRNA into a polypeptide. “Antisense inhibition” refers to the production of antisense RNA transcripts capable of suppressing the expression of the target protein. “Overexpression” refers to the production of a gene product in transgenic organisms that exceeds levels of production in normal or non-transformed organisms. “Co-suppression” refers to the production of sense RNA transcripts capable of suppressing the expression of identical or substantially similar foreign or endogenous genes (U.S. Pat. No. 5,231,020, incorporated herein by reference).

[0053] A “protein” or “polypeptide” is a chain of amino acids arranged in a specific order determined by the coding sequence in a polynucleotide encoding the polypeptide. Each protein or polypeptide has a unique function.

[0054] “Altered levels” or “altered expression” refers to the production of gene product(s) in transgenic organisms in amounts or proportions that differ from that of normal or non-transformed organisms.

[0055] “Mature protein” or the term “mature” when used in describing a protein refers to a post-translationally processed polypeptide; i.e., one from which any pre- or propeptides present in the primary translation product have been removed. “Precursor protein” or the term “precursor” when used in describing a protein refers to the primary product of translation of mRNA; i.e., with pre- and propeptides still present. Pre- and propeptides may be but are not limited to intracellular localization signals.

[0056] A “chloroplast transit peptide” is an amino acid sequence which is translated in conjunction with a protein and directs the protein to the chloroplast or other plastid types present in the cell in which the protein is made. “Chloroplast transit sequence” refers to a nucleotide sequence that encodes a chloroplast transit peptide. A “signal peptide” is an amino acid sequence which is translated in conjunction with a protein and directs the protein to the secretory system (Chrispeels (1991) Ann. Rev. Plant Phys. Plant Mol. Biol. 42:21-53). If the protein is to be directed to a vacuole, a vacuolar targeting signal (supra) can further be added, or if to the endoplasmic reticulum, an endoplasmic reticulum retention signal (supra) may be added. If the protein is to be directed to the nucleus, any signal peptide present should be removed and instead a nuclear localization signal included (Raikhel (1992) Plant Phys. 100:1627-1632).

[0057] “Transformation” refers to the transfer of a nucleic acid fragment into the genome of a host organism, resulting in genetically stable inheritance. Host organisms containing the transformed nucleic acid fragments are referred to as “transgenic” organisms. Examples of methods of plant transformation include Agrobacterium-mediated transformation (De Blaere et al. (1987) Meth. Enzymol. 143:277) and particle-accelerated or “gene gun” transformation technology (Klein et al. (1987) Nature (London) 327:70-73; U.S. Pat. No. 4,945,050, incorporated herein by reference). Thus, isolated polynucleotides of the present invention can be incorporated into recombinant constructs, typically DNA constructs, capable of introduction into and replication in a host cell. Such a construct can be a vector that includes a replication system and sequences that are capable of transcription and translation of a polypeptide-encoding sequence in a given host cell. A number of vectors suitable for stable transfection of plant cells or for the establishment of transgenic plants have been described in, e.g., Pouwels et al., Cloning Vectors: A Laboratory Manual, 1985, supp. 1987; Weissbach and Weissbach, Methods for Plant Molecular Biology, Academic Press, 1989; and Gelvin et al., Plant Molecular Biology Manual, Kluwer Academic Publishers, 1990. Typically, plant expression vectors include, for example, one or more cloned plant genes under the transcriptional control of 5′ and 3′ regulatory sequences and a dominant selectable marker. Such plant expression vectors also can contain a promoter regulatory region (e.g., a regulatory region controlling inducible or constitutive, environmentally- or developmentally-regulated, or cell- or tissue-specific expression), a transcription initiation start site, a ribosome binding site, an RNA processing signal, a transcription termination site, and/or a polyadenylation signal.

[0058] Standard recombinant DNA and molecular cloning techniques used herein are well known in the art and are described more fully in Sambrook et al. Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory Press: Cold Spring Harbor, 1989 (hereinafter “Maniatis”).

[0059] “PCR” or “polymerase chain reaction” is well known by those skilled in the art as a technique used for the amplification of specific DNA segments (U.S. Pat. Nos. 4,683,195 and 4,800,159).

[0060] The present invention concerns isolated polynucleotides comprising a nucleotide sequence selected from the group consisting of: (a) a nucleotide sequence encoding a polypeptide of at least 60 amino acids having at least 80% identity based on the Clustal method of alignment when compared to a polypeptide selected from the group consisting of SEQ ID NOs:2, 4, 6, 43, 45, 47, 49, and 51; (b) a nucleotide sequence encoding a polypeptide of at least 60 amino acids having at least 95% identity based on the Clustal method of alignment when compared to a polypeptide selected from the group consisting of SEQ ID NOs:9, 11, 13, 15, 17, 19, 54 and 56; (c) a nucleotide sequence encoding a polypeptide of at least 60 amino acids having at least 80% identity based on the Clustal method of alignment when compared to a polypeptide selected from the group consisting of SEQ ID NOs:22, 24, 26, 28, and 59; (d) a nucleotide sequence encoding a polypeptide of at least 60 amino acids having at least 95% identity based on the Clustal method of alignment when compared to a polypeptide selected from the group consisting of SEQ ID NOs:31, 62, and 64; and (e) a nucleotide sequence encoding a polypeptide of at least 60 amino acids having at least 85% identity based on the Clustal method of alignment when compared to a polypeptide selected from the group consisting of SEQ ID NOs:34, 36, 38, 40, 68, 70, and 72. It is preferred that the identity be at least 85%; it is preferable if the identity is at least 90%; and it is more preferred that the identity be at least 95%. This invention also relates to the isolated complement of such polynucleotides, wherein the complement and the polynucleotide consist of the same number of nucleotides, and the nucleotide sequences of the complement and the polynucleotide have 100% complementarity.

[0061] Preferably, the isolated polynucleotide of the claimed invention comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs:1, 3, 5, 42, 44, 46, 48, 50, 8, 10, 12, 14, 16, 18, 53, 55, 21, 23, 25, 27, 58, 30, 61, 63, 33, 35, 37, 39, 67, 69, and 71.

[0062] Nucleic acid fragments encoding at least a portion of several plant amino acid biosynthetic enzymes have been isolated and identified by comparison of random plant cDNA sequences to public databases containing nucleotide and protein sequences using the BLAST algorithms well known to those skilled in the art. The nucleic acid fragments of the instant invention may be used to isolate cDNAs and genes encoding homologous proteins from the same or other plant species. Isolation of homologous genes using sequence-dependent protocols is well known in the art. Examples of sequence-dependent protocols include, but are not limited to, methods of nucleic acid hybridization, and methods of DNA and RNA amplification as exemplified by various uses of nucleic acid amplification technologies (e.g., polymerase chain reaction, ligase chain reaction).

[0063] For example, genes encoding other aspartic semialdehyde dehydrogenases, diaminopimelate decarboxylases, homoserine kinases, cysteine γ synthases or cystathionine β-lyases, either as cDNAs or genomic DNAs, could be isolated directly by using all or a portion of the instant nucleic acid fragments as DNA hybridization probes to screen libraries from any desired plant employing methodology well known to those skilled in the art. Specific oligonucleotide probes based upon the instant nucleic acid sequences can be designed and synthesized by methods known in the art (Maniatis). Moreover, an entire sequence can be used directly to synthesize DNA probes by methods known to the skilled artisan such as random primer DNA labeling, nick translation, end-labeling techniques, or RNA probes using available in vitro transcription systems. In addition, specific primers can be designed and used to amplify a part or all of the instant sequences. The resulting amplification products can be labeled directly during amplification reactions or labeled after amplification reactions, and used as probes to isolate full length cDNA or genomic fragments under conditions of appropriate stringency.

[0064] In addition, two short segments of the instant nucleic acid fragments may be used in polymerase chain reaction protocols to amplify longer nucleic acid fragments encoding homologous genes from DNA or RNA. The polymerase chain reaction may also be performed on a library of cloned nucleic acid fragments wherein the sequence of one primer is derived from the instant nucleic acid fragments, and the sequence of the other primer takes advantage of the presence of the polyadenylic acid tracts to the 3′ end of the mRNA precursor encoding plant genes. Alternatively, the second primer sequence may be based upon sequences derived from the cloning vector. For example, the skilled artisan can follow the RACE protocol (Frohman et al. (1988) Proc. Natl. Acad. Sci. USA 85:8998-9002) to generate cDNAs by using PCR to amplify copies of the region between a single point in the transcript and the 3′ or 5′ end. Primers oriented in the 3′ and 5′ directions can be designed from the instant sequences. Using commercially available 3′ RACE or 5′ RACE systems (BRL), specific 3′ or 5′ cDNA fragments can be isolated (Ohara et al. (1989) Proc. Natl. Acad. Sci. USA 86:5673-5677; Loh et al. (1989) Science 243:217-220). Products generated by the 3′ and 5′ RACE procedures can be combined to generate full-length cDNAs (Frohman and Martin (1989) Techniques 1:165). Consequently, a polynucleotide comprising a nucleotide sequence of at least 30 (preferably at least 40, most preferably at least 60) contiguous nucleotides derived from a nucleotide sequence selected from the group consisting of SEQ ID NOs:1, 3, 5, 42, 44, 46, 48, 50, 8, 10, 12, 14, 16, 18, 59, 61, 21, 23, 25, 27, 64, 30, 33, 35, 37, 39, 53, 55, and 57 and the complement of such nucleotide sequences may be used in such methods to obtain a nucleic acid fragment encoding a substantial portion of an amino acid sequence of a polypeptide.

[0065] The present invention relates to a method of obtaining a nucleic acid fragment encoding a substantial portion of an aspartic semialdehyde dehydrogenase, diaminopimelate decarboxylase, homoserine kinase, cysteine synthase, or cystathionine β-lyase polypeptide, preferably a substantial portion of a plant aspartic semialdehyde dehydrogenase, diaminopimelate decarboxylase, homoserine kinase, cysteine synthase, or cystathionine β-lyase polypeptide, comprising the steps of: synthesizing an oligonucleotide primer comprising a nucleotide sequence of at least 30 (preferably at least 40, most preferably at least 60) contiguous nucleotides derived from a nucleotide sequence selected from the group consisting of SEQ ID NOs:1, 3, 5, 42, 44, 46, 48, 50, 8, 10, 12, 14, 16, 18, 53, 55, 21, 23, 25, 27, 58, 30, 61, 63, 33, 35, 37, 39, 67, 69, and 71, and the complement of such nucleotide sequences; and amplifying a nucleic acid fragment (preferably a cDNA inserted in a cloning vector) using the oligonucleotide primer. The amplified nucleic acid fragment preferably will encode a portion of an aspartic semialdehyde dehydrogenase, diaminopimelate decarboxylase, homoserine kinase, cysteine synthase, or cystathionine β-lyase polypeptide.
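Merely as an illustrative sketch, whether a candidate oligonucleotide consists of at least 30 contiguous nucleotides derived from a given sequence or from its complement could be checked as follows. The function names and the example sequence are hypothetical and are not taken from the Sequence Listing; the complement is interpreted here as the reverse complement read 5′ to 3′, which is an assumption of the example.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def is_derived_primer(primer: str, source: str, min_len: int = 30) -> bool:
    """True if the primer is at least min_len nucleotides long and occurs as a
    contiguous stretch of the source sequence or of its reverse complement."""
    primer = primer.upper()
    source = source.upper()
    if len(primer) < min_len:
        return False
    return primer in source or primer in reverse_complement(source)


if __name__ == "__main__":
    # Hypothetical 42-nucleotide source sequence used only for demonstration.
    source = "ATGGCTTCCAAGGTTCTCGACGGTATCAAGCCTGAAGTTGCC"
    forward = source[5:35]                 # 30 contiguous nucleotides
    reverse = reverse_complement(forward)  # primer taken from the complement
    print(is_derived_primer(forward, source))
    print(is_derived_primer(reverse, source))
    print(is_derived_primer(source[5:25], source))  # only 20 nucleotides
```

The `min_len` argument could be raised to 40 or 60 to reflect the preferred and most preferred lengths recited above.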

[0066] Availability of the instant nucleotide and deduced amino acid sequences facilitates immunological screening of cDNA expression libraries. Synthetic peptides representing portions of the instant amino acid sequences may be synthesized. These peptides can be used to immunize animals to produce polyclonal or monoclonal antibodies with specificity for peptides or proteins comprising the amino acid sequences. These antibodies can then be used to screen cDNA expression libraries to isolate full-length cDNA clones of interest (Lerner (1984) Adv. Immunol. 36:1-34; Maniatis).

[0067] In another embodiment, this invention concerns viruses and host cells comprising either the chimeric genes of the invention as described herein or an isolated polynucleotide of the invention as described herein. Examples of host cells which can be used to practice the invention include, but are not limited to, yeast, bacteria, and plants.

[0068] As was noted above, the nucleic acid fragments of the instant invention may be used to create transgenic plants in which the disclosed polypeptides are present at higher or lower levels than normal or in cell types or developmental stages in which they are not normally found. This would have the effect of altering the level of free amino acids in those cells. Specifically, the enzymes of the present invention form part of the pathway towards the biosynthesis of lysine, threonine, methionine, cysteine and isoleucine. In particular, altering the level and/or function of cystathionine β-lyase will result in changes in the rate of methionine biosynthesis. Altering the level and/or function of diaminopimelate decarboxylase will result in changes in the rate of lysine biosynthesis. Altering the level and/or function of aspartate-semialdehyde dehydrogenase will result in changes in the lysine, methionine, or threonine content, especially in wheat. Altering the level of cysteine γ synthase will result in changes in the rate of cysteine and/or methionine biosynthesis; using this gene it will also be possible to control sulfur metabolism. Altering the level of homoserine kinase may be used to regulate threonine and methionine levels. Polypeptides encoding at least a portion of aspartic semialdehyde dehydrogenase, diaminopimelate decarboxylase, homoserine kinase, cysteine synthase, or cystathionine β-lyase may also be used in herbicide identification and design.

[0069] Overexpression of the proteins of the instant invention may be accomplished by first constructing a chimeric gene in which the coding region is operably linked to a promoter capable of directing expression of a gene in the desired tissues at the desired stage of development. The chimeric gene may comprise promoter sequences and translation leader sequences derived from the same genes. 3′ Non-coding sequences encoding transcription termination signals may also be provided. The instant chimeric gene may also comprise one or more introns in order to facilitate gene expression.

[0070] Plasmid vectors comprising the instant isolated polynucleotide (or chimeric gene) may be constructed. The choice of plasmid vector is dependent upon the method that will be used to transform host plants. The skilled artisan is well aware of the genetic elements that must be present on the plasmid vector in order to successfully transform, select and propagate host cells containing the chimeric gene. The skilled artisan will also recognize that different independent transformation events will result in different levels and patterns of expression (Jones et al. (1985) EMBO J. 4:2411-2418; De Almeida et al. (1989) Mol. Gen. Genetics 218:78-86), and thus that multiple events must be screened in order to obtain lines displaying the desired expression level and pattern. Such screening may be accomplished by Southern analysis of DNA, Northern analysis of mRNA expression, Western analysis of protein expression, or phenotypic analysis.

[0071] For some applications it may be useful to direct the instant polypeptides to different cellular compartments, or to facilitate their secretion from the cell. It is thus envisioned that the chimeric gene described above may be further supplemented by directing the coding sequence to encode the instant polypeptides with appropriate intracellular targeting sequences such as transit sequences (Keegstra (1989) Cell 56:247-253), signal sequences or sequences encoding endoplasmic reticulum localization (Chrispeels (1991) Ann. Rev. Plant Phys. Plant Mol. Biol. 42:21-53), or nuclear localization signals (Raikhel (1992) Plant Phys. 100:1627-1632) with or without removing targeting sequences that are already present. While the references cited give examples of each of these, the list is not exhaustive and more targeting signals of use may be discovered in the future.

[0072] It may also be desirable to reduce or eliminate expression of genes encoding the instant polypeptides in plants for some applications. In order to accomplish this, a chimeric gene designed for co-suppression of the instant polypeptide can be constructed by linking a gene or gene fragment encoding that polypeptide to plant promoter sequences. Alternatively, a chimeric gene designed to express antisense RNA for all or part of the instant nucleic acid fragment can be constructed by linking the gene or gene fragment in reverse orientation to plant promoter sequences. Either the co-suppression or antisense chimeric genes could be introduced into plants via transformation wherein expression of the corresponding endogenous genes is reduced or eliminated.
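To illustrate what linking a fragment in reverse orientation implies at the sequence level, the following sketch derives the antisense RNA that would be transcribed from a coding-strand DNA fragment placed in reverse orientation behind a promoter. The function names and the six-nucleotide example are hypothetical; the sketch assumes an unambiguous A, C, G, T sequence and ignores promoter, leader, and terminator sequences.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def sense_rna(coding_strand_dna: str) -> str:
    """RNA with the same sequence as the coding strand (T replaced by U)."""
    return coding_strand_dna.upper().replace("T", "U")


def antisense_rna(coding_strand_dna: str) -> str:
    """RNA transcribed when the fragment is inserted in reverse orientation.

    The transcript equals the reverse complement of the coding strand,
    written 5' to 3', and is therefore complementary to the sense RNA.
    """
    reverse_complement = coding_strand_dna.upper().translate(_COMPLEMENT)[::-1]
    return reverse_complement.replace("T", "U")


if __name__ == "__main__":
    fragment = "ATGGCT"             # hypothetical coding-strand fragment
    print(sense_rna(fragment))      # AUGGCU
    print(antisense_rna(fragment))  # AGCCAU
```

Reading the antisense transcript 3′ to 5′ against the sense transcript 5′ to 3′ gives the base pairs A-U, U-A, G-C, G-C, C-G, U-A for this example.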

[0073] Molecular genetic solutions to the generation of plants with altered gene expression have a decided advantage over more traditional plant breeding approaches. Changes in plant phenotypes can be produced by specifically inhibiting expression of one or more genes by antisense inhibition or cosuppression (U.S. Pat. Nos. 5,190,931, 5,107,065 and 5,283,323). An antisense or cosuppression construct would act as a dominant negative regulator of gene activity. While conventional mutations can yield negative regulation of gene activity these effects are most likely recessive. The dominant negative regulation available with a transgenic approach may be advantageous from a breeding perspective. In addition, the ability to restrict the expression of a specific phenotype to the reproductive tissues of the plant by the use of tissue specific promoters may confer agronomic advantages relative to conventional mutations which may have an effect in all tissues in which a mutant gene is ordinarily expressed.

[0074] The person skilled in the art will know that special considerations are associated with the use of antisense or cosuppression technologies in order to reduce expression of particular genes. For example, the proper level of expression of sense or antisense genes may require the use of different chimeric genes utilizing different regulatory elements known to the skilled artisan. Once transgenic plants are obtained by one of the methods described above, it will be necessary to screen individual transgenics for those that most effectively display the desired phenotype. Accordingly, the skilled artisan will develop methods for screening large numbers of transformants. The nature of these screens will generally be chosen on practical grounds. For example, one can screen by looking for changes in gene expression by using antibodies specific for the protein encoded by the gene being suppressed, or one could establish assays that specifically measure enzyme activity. A preferred method will be one which allows large numbers of samples to be processed rapidly, since it will be expected that a large number of transformants will be negative for the desired phenotype.

[0075] In another embodiment, the present invention concerns an aspartic-semialdehyde dehydrogenase polypeptide of at least 50 amino acids comprising at least 70% identity based on the Clustal method of alignment compared to a polypeptide selected from the group consisting of SEQ ID NOs:2, 4, 6, 43, 45, 47, 49, and 51, a diaminopimelate decarboxylase polypeptide of at least 60 amino acids comprising at least 95% identity based on the Clustal method of alignment compared to a polypeptide selected from the group consisting of SEQ ID NOs:9, 11, 13, 15, 17, 19, 60, and 62, a homoserine kinase polypeptide of at least 60 amino acids comprising at least 70% identity based on the Clustal method of alignment compared to a polypeptide selected from the group consisting of SEQ ID NOs:22, 24, 26, 28, and 65, a cysteine synthase polypeptide of at least 60 amino acids comprising at least 90% identity based on the Clustal method of alignment compared to a polypeptide of SEQ ID NO:31, or a cystathionine β-lyase polypeptide of at least 60 amino acids comprising at least 85% identity based on the Clustal method of alignment compared to a polypeptide selected from the group consisting of SEQ ID NOs:34, 36, 38, 40, 54, 56, and 58.

[0076] The instant polypeptides (or portions thereof) may be produced in heterologous host cells, particularly in the cells of microbial hosts, and can be used to prepare antibodies to these proteins by methods well known to those skilled in the art. The antibodies are useful for detecting the polypeptides of the instant invention in situ in cells or in vitro in cell extracts. Preferred heterologous host cells for production of the instant polypeptides are microbial hosts. Microbial expression systems and expression vectors containing regulatory sequences that direct high level expression of foreign proteins are well known to those skilled in the art. Any of these could be used to construct a chimeric gene for production of the instant polypeptides. This chimeric gene could then be introduced into appropriate microorganisms via transformation to provide high level expression of the encoded plant biosynthetic enzymes. An example of a vector for high level expression of the instant polypeptides in a bacterial host is provided (Example 10).

[0077] Additionally, the instant polypeptides can be used as a target to facilitate design and/or identification of inhibitors of those enzymes that may be useful as herbicides. This is desirable because the polypeptides described herein catalyze various steps in a pathway leading to production of several essential amino acids. Accordingly, inhibition of the activity of one or more of the enzymes described herein could lead to inhibition of plant growth. Thus, the instant polypeptides could be appropriate for new herbicide discovery and design.

[0078] All or a substantial portion of the polynucleotides of the instant invention may also be used as probes for genetically and physically mapping the genes that they are a part of, and used as markers for traits linked to those genes. Such information may be useful in plant breeding in order to develop lines with desired phenotypes. For example, the instant nucleic acid fragments may be used as restriction fragment length polymorphism (RFLP) markers. Southern blots (Maniatis) of restriction-digested plant genomic DNA may be probed with the nucleic acid fragments of the instant invention. The resulting banding patterns may then be subjected to genetic analyses using computer programs such as MapMaker (Lander et al. (1987) Genomics 1:174-181) in order to construct a genetic map. In addition, the nucleic acid fragments of the instant invention may be used to probe Southern blots containing restriction endonuclease-treated genomic DNAs of a set of individuals representing parent and progeny of a defined genetic cross. Segregation of the DNA polymorphisms is noted and used to calculate the position of the instant nucleic acid sequence in the genetic map previously obtained using this population (Botstein et al. (1980) Am. J. Hum. Genet. 32:314-331).
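As a simplified illustration of how segregation data may be turned into a map position, the sketch below estimates the recombination fraction between two markers scored in the same progeny and converts it to a map distance with the Haldane mapping function. This is not the MapMaker algorithm, which performs multipoint likelihood analysis; the function names, the "A"/"B" scoring scheme, and the assumption of backcross progeny with known linkage phase are all made for the purpose of the example.

```python
import math


def recombination_fraction(marker_1: str, marker_2: str) -> float:
    """Fraction of progeny whose parental-allele scores differ between two markers.

    Assumption: each string scores the same backcross progeny in the same
    order, one character per individual ("A" or "B" for the parental allele).
    """
    if len(marker_1) != len(marker_2) or not marker_1:
        raise ValueError("markers must be scored on the same non-empty progeny set")
    recombinants = sum(1 for a, b in zip(marker_1, marker_2) if a != b)
    return recombinants / len(marker_1)


def haldane_cm(r: float) -> float:
    """Haldane map distance in centimorgans for a recombination fraction r < 0.5."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


if __name__ == "__main__":
    # Hypothetical scores for ten progeny at a known marker and at the probe locus.
    known_marker = "AAAABBBBAB"
    probe_locus = "AAABBBBBAB"
    r = recombination_fraction(known_marker, probe_locus)
    print(r, round(haldane_cm(r), 2))
```

With real data, many more progeny would be scored and distances to several flanking markers would be compared to place the probe on the map.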

[0079] The production and use of plant gene-derived probes for use in genetic mapping is described in Bernatzky and Tanksley (1986) Plant Mol. Biol. Reporter 4:37-41. Numerous publications describe genetic mapping of specific cDNA clones using the methodology outlined above or variations thereof. For example, F2 intercross populations, backcross populations, randomly mated populations, near isogenic lines, and other sets of individuals may be used for mapping. Such methodologies are well known to those skilled in the art.

[0080] Nucleic acid probes derived from the instant nucleic acid sequences may also be used for physical mapping (i.e., placement of sequences on physical maps; see Hoheisel et al. In: Nonmammalian Genomic Analysis: A Practical Guide, Academic Press 1996, pp. 319-346, and references cited therein).

[0081] In another embodiment, nucleic acid probes derived from the instant nucleic acid sequences may be used in direct fluorescence in situ hybridization (FISH) mapping (Trask (1991) Trends Genet. 7:149-154). Although current methods of FISH mapping favor use of large clones (several to several hundred kb; see Laan et al. (1995) Genome Res. 5:13-20), improvements in sensitivity may allow performance of FISH mapping using shorter probes.

[0082] A variety of nucleic acid amplification-based methods of genetic and physical mapping may be carried out using the instant nucleic acid sequences. Examples include allele-specific amplification (Kazazian (1989) J. Lab. Clin. Med. 11:95-96), polymorphism of PCR-amplified fragments (CAPS; Sheffield et al. (1993) Genomics 16:325-332), allele-specific ligation (Landegren et al. (1988) Science 241:1077-1080), nucleotide extension reactions (Sokolov (1990) Nucleic Acid Res. 18:3671), Radiation Hybrid Mapping (Walter et al. (1997) Nat. Genet. 7:22-28) and Happy Mapping (Dear and Cook (1989) Nucleic Acid Res. 17:6795-6807). For these methods, the sequence of a nucleic acid fragment is used to design and produce primer pairs for use in the amplification reaction or in primer extension reactions. The design of such primers is well known to those skilled in the art. In methods employing PCR-based genetic mapping, it may be necessary to identify DNA sequence differences between the parents of the mapping cross in the region corresponding to the instant nucleic acid sequence. This, however, is generally not necessary for mapping methods.

[0083] Loss of function mutant phenotypes may be identified for the instant cDNA clones either by targeted gene disruption protocols or by identifying specific mutants for these genes contained in a maize population carrying mutations in all possible genes (Ballinger and Benzer (1989) Proc. Natl. Acad. Sci USA 86:9402-9406; Koes et al. (1995) Proc. Natl. Acad. Sci USA 92:8149-8153; Bensen et al. (1995) Plant Cell 7:75-84). The latter approach may be accomplished in two ways. First, short segments of the instant nucleic acid fragments may be used in polymerase chain reaction protocols in conjunction with a mutation tag sequence primer on DNAs prepared from a population of plants in which Mutator transposons or some other mutation-causing DNA element has been introduced (see Bensen, supra). The amplification of a specific DNA fragment with these primers indicates the insertion of the mutation tag element in or near the plant gene encoding the instant polypeptides. Alternatively, the instant nucleic acid fragment may be used as a hybridization probe against PCR amplification products generated from the mutation population using the mutation tag sequence primer in conjunction with an arbitrary genomic site primer, such as that for a restriction enzyme site-anchored synthetic adaptor. With either method, a plant containing a mutation in the endogenous gene encoding the instant polypeptides can be identified and obtained. This mutant plant can then be used to determine or confirm the natural function of the instant polypeptides disclosed herein.

EXAMPLES

[0084] The present invention is further defined in the following Examples, in which parts and percentages are by weight and degrees are Celsius, unless otherwise stated. It should be understood that these Examples, while indicating preferred embodiments of the invention, are given by way of illustration only. From the above discussion and these Examples, one skilled in the art can ascertain the essential characteristics of this invention, and without departing from the spirit and scope thereof, can make various changes and modifications of the invention to adapt it to various usages and conditions. Thus, various modifications of the invention in addition to those shown and described herein will be apparent to those skilled in the art from the foregoing description. Such modifications are also intended to fall within the scope of the appended claims.

[0085] The disclosure of each reference set forth herein is incorporated herein by reference in its entirety.

Example 1 Composition of cDNA Libraries: Isolation and Sequencing of cDNA Clones

[0086] cDNA libraries representing mRNAs from various corn, rice, soybean, and wheat tissues were prepared. The characteristics of the libraries are described below.

TABLE 2
cDNA Libraries from Corn, Rice, Soybean, and Wheat

Library  Tissue                                              Clone
cen1     Corn Endosperm 12 Days After Pollination            cen1.pk0061.d4
cen3n    Corn Endosperm 20 Days After Pollination*           cen3n.pk0067.a3
cpe1c    Corn pooled BMS treated with chemicals related to   cpe1c.pk009.b24
         phosphatase**
cr1n     Corn Root From 7 Day Seedlings*                     cr1n.pk0009.g4
cr1n     Corn Root From 7 Day Seedlings*                     cr1n.pk0103.d8
p0003    Corn Premeiotic Ear Shoot, 0.2-4 cm                 p0003.cgpha22r:fis
p0005    Corn Immature Ear                                   p0005.cbmei71r
p0014    Corn Leaves 7 and 8 from Plant Transformed with     p0014.ctuui39r
         G-protein Gene, C. heterostrophus Resistant
p0016    Corn Tassel Shoots (0.1-1.4 cm), Pooled             p0016.ctscp83r
p0075    Corn Shoot And Leaf Material From Dark-Grown        p0075.cslab16r
         7 Day-Old Seedlings
p0109    Corn Leaves From Les9 Transition Zone and Les9      p0109.cdadg47r
         Mature Lesions, Pooled***
p0125    Corn Anther Prophase I*                             p0125.czaay16r
rca1c    Rice Nipponbare Callus                              rca1c.pk005.k3
r10n     Rice Leaf 15 Days After Germination*                r10n.pk0013.b9
rlr12    Rice Leaf 15 Days After Germination, 12 Hours       rlr12.pk0026.g1
         After Infection of Strain Magnaporthe grisea
         4360-R-62 (AVR2-YAMO)
rlr48    Rice Leaf 15 Days After Germination, 48 Hours       rlr48.pk0003.d12
         After Infection of Strain Magnaporthe grisea
         4360-R-62 (AVR2-YAMO)
se3      Soybean Embryo 13 Days After Flowering              se3.05h06
sdp3c    Soybean Developing Pods 8-9 mm                      sdp3c.pk001.o15
ses8w    Mature Soybean Embryo 8 Weeks After Subculture      ses8w.pk0020.b5
ses9c    Soybean Embryogenic Suspension                      ses9c.pk001.a15:fis
sfl1     Soybean Immature Flower                             sfl1.pk0012.c4
sfl1     Soybean Immature Flower                             sfl1.pk0122.f9
sr1      Soybean Root From 10 Day Old Seedlings              sr1.pk0132.c1
wdk1c    Wheat Developing Kernel, 3 Days After Anthesis      wdk1c.pk014.n5:fis
wl1n     Wheat Leaf from 7 Day Old Seedling*                 wl1n.pk0065.f2
wlk1     Wheat Seedlings 1 hour After Fungicide              wlk1.pk0012.c2
         Treatment****
wr1      Wheat Root From 7 Day Old Seedlings                 wr1.pk0004.c11
wr1      Wheat Root From 7 Day Old Seedlings                 wr1.pk0091.g6

[0087] cDNA libraries may be prepared by any one of many methods available. For example, the cDNAs may be introduced into plasmid vectors by first preparing the cDNA libraries in Uni-ZAP™ XR vectors according to the manufacturer's protocol (Stratagene Cloning Systems, La Jolla, Calif.). The Uni-ZAP™ XR libraries are converted into plasmid libraries according to the protocol provided by Stratagene. Upon conversion, cDNA inserts will be contained in the plasmid vector pBluescript. In addition, the cDNAs may be introduced directly into precut Bluescript II SK(+) vectors (Stratagene) using T4 DNA ligase (New England Biolabs), followed by transfection into DH10B cells according to the manufacturer's protocol (GIBCO BRL Products). Once the cDNA inserts are in plasmid vectors, plasmid DNAs are prepared from randomly picked bacterial colonies containing recombinant pBluescript plasmids, or the insert cDNA sequences are amplified via polymerase chain reaction using primers specific for vector sequences flanking the inserted cDNA sequences. Amplified insert DNAs or plasmid DNAs are sequenced in dye-primer sequencing reactions to generate partial cDNA sequences (expressed sequence tags or “ESTs”; see Adams et al., (1991) Science 252:1651-1656). The resulting ESTs are analyzed using a Perkin Elmer Model 377 fluorescent sequencer.

[0088] Full-insert sequence (FIS) data is generated utilizing a modified transposition protocol. Clones identified for FIS are recovered from archived glycerol stocks as single colonies, and plasmid DNAs are isolated via alkaline lysis. Isolated DNA templates are reacted with vector primed M13 forward and reverse oligonucleotides in a PCR-based sequencing reaction and loaded onto automated sequencers. Confirmation of clone identification is performed by sequence alignment to the original EST sequence from which the FIS request is made.

[0089] Confirmed templates are transposed via the Primer Island transposition kit (PE Applied Biosystems, Foster City, Calif.) which is based upon the Saccharomyces cerevisiae Ty1 transposable element (Devine and Boeke (1994) Nucleic Acids Res. 22:3765-3772). The in vitro transposition system places unique binding sites randomly throughout a population of large DNA molecules. The transposed DNA is then used to transform DH10B electro-competent cells (Gibco BRL/Life Technologies, Rockville, Md.) via electroporation. The transposable element contains an additional selectable marker (named DHFR; Fling and Richards (1983) Nucleic Acids Res. 11:5147-5158), allowing for dual selection on agar plates of only those subclones containing the integrated transposon. Multiple subclones are randomly selected from each transposition reaction, plasmid DNAs are prepared via alkaline lysis, and templates are sequenced (ABI Prism dye-terminator ReadyReaction mix) outward from the transposition event site, utilizing unique primers specific to the binding sites within the transposon.

[0090] Sequence data is collected (ABI Prism Collections) and assembled using Phred/Phrap (P. Green, University of Washington, Seattle). Phred/Phrap is a public domain software program which re-reads the ABI sequence data, re-calls the bases, assigns quality values, and writes the base calls and quality values into editable output files. The Phrap sequence assembly program uses these quality values to increase the accuracy of the assembled sequence contigs. Assemblies are viewed by the Consed sequence editor (D. Gordon, University of Washington, Seattle).
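As an illustration of how base-call quality values relate to error estimates, the following sketch converts between a quality value Q and an estimated error probability P using the conventional Phred relation Q = -10 log10(P). That relation is assumed here as the standard definition and is not stated in this disclosure; the function names are hypothetical and are not part of the Phred, Phrap, or Consed programs.

```python
import math


def quality_to_error_probability(q: float) -> float:
    """Estimated probability that a base call with quality value q is wrong."""
    return 10 ** (-q / 10)


def error_probability_to_quality(p: float) -> float:
    """Quality value corresponding to an estimated error probability p."""
    if not 0 < p <= 1:
        raise ValueError("p must be in the interval (0, 1]")
    return -10 * math.log10(p)


def expected_errors(qualities) -> float:
    """Sum of per-base error probabilities over a read (illustrative only)."""
    return sum(quality_to_error_probability(q) for q in qualities)
```

Under this relation, a quality value of 20 corresponds to an estimated error rate of 1 in 100 and a value of 30 to 1 in 1000, which is why an assembler can weight high-quality base calls more heavily when building a consensus.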

Example 2 Identification of cDNA Clones

[0091] cDNA clones encoding plant amino acid biosynthetic enzymes were identified by conducting BLAST (Basic Local Alignment Search Tool; Altschul et al. (1990) J. Mol. Biol. 215:403-410; see also www.ncbi.nlm.nih.gov/BLAST/) searches for similarity to sequences contained in the BLAST “nr” database (comprising all non-redundant GenBank CDS translations, sequences derived from the 3-dimensional structure Brookhaven Protein Data Bank, the last major release of the SWISS-PROT protein sequence database, EMBL, and DDBJ databases). The cDNA sequences obtained in Example 1 were analyzed for similarity to all publicly available DNA sequences contained in the “nr” database using the BLASTN algorithm provided by the National Center for Biotechnology Information (NCBI). The DNA sequences were translated in all reading frames and compared for similarity to all publicly available protein sequences contained in the “nr” database using the BLASTX algorithm (Gish and States (1993) Nat. Genet. 3:266-272) provided by the NCBI. For convenience, the P-value (probability) of observing a match of a cDNA sequence to a sequence contained in the searched databases merely by chance, as calculated by BLAST, is reported herein as a “pLog” value, which represents the negative of the logarithm of the reported P-value. Accordingly, the greater the pLog value, the greater the likelihood that the cDNA sequence and the BLAST “hit” represent homologous proteins.
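The pLog convention described above can be sketched as a one-line conversion. The text does not state the base of the logarithm; base 10 is assumed here, and the function name plog is hypothetical.

```python
import math


def plog(p_value: float) -> float:
    """Negative logarithm of a BLAST P-value (base 10 is an assumption)."""
    if not 0 < p_value <= 1:
        raise ValueError("P-value must be in the interval (0, 1]")
    return -math.log10(p_value)


# A smaller chance probability gives a larger pLog value.
strong_hit = plog(1e-51)
weak_hit = plog(1e-6)
```

Under the base-10 assumption, a reported P-value of 1e-51 gives a pLog of about 51, so larger pLog values indicate matches that are less likely to have arisen by chance.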

[0092] ESTs submitted for analysis are compared to the GenBank database as described above. ESTs that contain sequences more 5- or 3-prime can be found by using the BLASTn algorithm (Altschul et al. (1997) Nucleic Acids Res. 25:3389-3402) against the DuPont proprietary database comparing nucleotide sequences that share common or overlapping regions of sequence homology. Where common or overlapping sequences exist between two or more nucleic acid fragments, the sequences can be assembled into a single contiguous nucleotide sequence, thus extending the original fragment in either the 5- or 3-prime direction. Once the most 5-prime EST is identified, its complete sequence can be determined by Full Insert Sequencing as described in Example 1. Homologous genes belonging to different species can be found by comparing the amino acid sequence of a known gene (from either a proprietary source or a public database) against an EST database using the tBLASTn algorithm. The tBLASTn algorithm searches an amino acid query against a nucleotide database that is translated in all 6 reading frames. This search allows for differences in nucleotide codon usage between different species, and for codon degeneracy.
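The six-frame translation that underlies a tBLASTn-style comparison can be illustrated with the sketch below. It is not the BLAST implementation; it only shows how one nucleotide sequence yields six conceptual translations (three per strand) under the standard genetic code. The function names and the short test sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna: str) -> str:
    """Translate from the first base; unknown codons become 'X', stops '*'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna: str) -> dict:
    """Translate a sequence in all three frames of both strands."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


frames = six_frame_translation("ATGGCCTAA")
```

Because the comparison is made at the protein level, two nucleotide sequences that differ only in synonymous codons (for example GCT and GCC, both alanine) translate identically, which is how such a search tolerates codon usage differences between species.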

Example 3 Characterization of cDNA Clones Encoding Aspartate Semialdehyde Dehydrogenase

[0093] The BLASTX search using the EST sequences from clones listed in Table 3 revealed similarity of the polypeptides encoded by the cDNAs to aspartate semialdehyde dehydrogenase from Synechocystis sp. (DDBJ Accession No. D64006; NCBI General Identifier No. 1001379) or Legionella pneumophila (GenBank Accession No. AF034213; NCBI General Identifier No. 2645882). Shown in Table 3 are the BLAST results for individual ESTs (“EST”), or for the sequences of the entire cDNA inserts comprising the indicated cDNA clones (“FIS”):

TABLE 3
BLAST Results for Sequences Encoding Polypeptides Homologous to Aspartate Semialdehyde Dehydrogenase

                              BLAST pLog Score
Clone               Status    Synechocystis sp.    Legionella pneumophila
                              GI 1001379           GI 2645882
rlr48.pk0003.d12    FIS       51.00                36.00
wr1.pk0004.c11      EST       67.96                44.74
sfl1.pk0122.f9      EST       6.60

[0094] The sequence of the entire cDNA insert in clone sfl1.pk0122.f9 was determined, RACE PCR was used to obtain the 5′ portion of the rice aspartate semialdehyde dehydrogenase, and further sequencing and searching of the DuPont proprietary database allowed the identification of a corn clone and additional soybean and wheat clones encoding aspartate semialdehyde dehydrogenase. The BLASTX search using the EST sequences from clones listed in Table 4 revealed similarity of the polypeptides encoded by the cDNAs to aspartate semialdehyde dehydrogenase from Aquifex aeolicus (NCBI General Identifier No. 6225258). Shown in Table 4 are the BLAST results for the sequences of contigs assembled from two or more ESTs (“Contig”), or the sequences encoding the entire protein derived from either the entire cDNA inserts comprising the indicated cDNA clones or contigs assembled from 5′ RACE PCR and the sequence of the entire cDNA insert in the indicated cDNA clone (“CGS”):

TABLE 4
BLAST Results for Sequences Encoding Polypeptides Homologous to Aspartate Semialdehyde Dehydrogenase

                                      BLAST pLog Score
Clone                       Status    Aquifex aeolicus GI 6225258
Contig of:                  Contig    78.70
  cpe1c.pk009.b24
  p0003.cgpha22r:fis
  p0016.ctscp83r
  p0075.cslab16r
5′ RACE PCR +               CGS       89.20
  rlr48.pk0003.d12:fis
ses9c.pk001.a15:fis         CGS       87.40
sfl1.pk0122.f9:fis          CGS       88.10
wdk1c.pk014.n5:fis          CGS       91.50

[0095] FIG. 2 presents an alignment of the amino acid sequences set forth in SEQ ID NOs:2, 4, 6, 43, 45, 47, 49, and 51 with the Legionella pneumophila sequence (NCBI General Identifier No. 2645882; SEQ ID NO:7) and the Aquifex aeolicus sequence (NCBI General Identifier No. 6225258; SEQ ID NO:52). The data in Table 5 present a calculation of the percent identity of the amino acid sequences set forth in SEQ ID NOs:2, 4, 6, 43, 45, 47, 49, and 51 with the Legionella pneumophila sequence (NCBI General Identifier No. 2645882; SEQ ID NO:7) and the Aquifex aeolicus sequence (NCBI General Identifier No. 6225258; SEQ ID NO:52).

TABLE 5
Percent Identity of Amino Acid Sequences Deduced From the Nucleotide Sequences of cDNA Clones Encoding Polypeptides Homologous to Aspartate Semialdehyde Dehydrogenase

                            Amino acid    Percent Identity to
Clone                       SEQ ID NO.    GI 2645882    GI 6225258
rlr48.pk0003.d12            2             42.1          45.6
wr1.pk0004.c11              4             42.3          44.8
sfl1.pk0122.f9              6             29.1          25.6
Contig of:                  43            41.2          45.9
  cpe1c.pk009.b24
  p0003.cgpha22r:fis
  p0016.ctscp83r
  p0075.cslab16r
5′ RACE PCR +               45            43.2          47.0
  rlr48.pk0003.d12:fis
ses9c.pk001.a15:fis         47            43.5          49.1
sfl1.pk0122.f9:fis          49            41.2          45.6
wdk1c.pk014.n5:fis          51            43.2          49.4

[0096] As seen in FIG. 2, the amino acid sequence shown in SEQ ID NO:2 is identical to amino acids 181 through 375 of SEQ ID NO:45; the sequence shown in SEQ ID NO:4 is identical to amino acids 173 through 374 of the sequence shown in SEQ ID NO:51; the sequence shown in SEQ ID NO:6 is identical to amino acids 1 through 86 of the sequence shown in SEQ ID NO:49; there are 5 amino acid differences between the sequences shown in SEQ ID NO:47 and SEQ ID NO:49; there are 18 amino acid differences between amino acids 89 through 375 of the sequence shown in SEQ ID NO:43 and the sequence shown in SEQ ID NO:45; and there are 15 differences between the amino acid sequences shown in SEQ ID NO:45 and in SEQ ID NO:51.
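Relationships of the kind stated above (a partial sequence being identical to a stretch of a longer sequence) can be checked with a simple exact-match search. The sketch below is illustrative only: the function name is hypothetical and the toy sequences are not any of the SEQ ID NOs of this disclosure.

```python
def locate_fragment(fragment: str, full_length: str):
    """Return the 1-based, inclusive (start, end) residue positions at which
    `fragment` occurs exactly within `full_length`, or None if it is absent."""
    index = full_length.find(fragment)
    if index == -1:
        return None
    return index + 1, index + len(fragment)


# Toy example: residues 9 through 11 of the longer sequence.
position = locate_fragment("KLM", "ACDEFGHIKLMNP")
```

Positions are reported 1-based and inclusive to match the way residue ranges are quoted in the text.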

[0097] Sequence alignments and percent identity calculations were performed using the Megalign program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, Wis.). Multiple alignment of the sequences was performed using the Clustal method of alignment (Higgins and Sharp (1989) CABIOS. 5:151-153) with the default parameters (GAP PENALTY=10, GAP LENGTH PENALTY=10). Default parameters for pairwise alignments using the Clustal method were KTUPLE=1, GAP PENALTY=3, WINDOW=5 and DIAGONALS SAVED=5. Sequence alignments and BLAST scores and probabilities indicate that the nucleic acid fragments comprising the instant cDNA clones encode a substantial portion of a corn aspartate semialdehyde dehydrogenase, a substantial portion and an entire rice aspartate semialdehyde dehydrogenase, a portion and an entire wheat aspartate semialdehyde dehydrogenase, and a portion and two entire soybean aspartate semialdehyde dehydrogenases.
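For readers unfamiliar with percent identity figures, the sketch below shows one common way to compute percent identity from two sequences that have already been aligned. It is not the Megalign calculation, whose exact formula is not given in this disclosure; here identity is assumed to be the number of matching residues divided by the number of aligned columns in which neither sequence has a gap. The function name and example strings are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where both aligned sequences have a residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == gap or residue_b == gap:
            continue  # gap columns are excluded under this assumed definition
        compared += 1
        if residue_a == residue_b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Five gap-free columns, four of them identical.
identity = percent_identity("MKT-LV", "MRTALV")
```

Other conventions divide by the length of the shorter sequence or by the full alignment length, so values computed this way may differ slightly from those reported by a particular program.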

Example 4 Characterization of cDNA Clones Encoding Diaminopimelate Decarboxylase

[0098] The BLASTX search using the EST sequences from clones listed in Table 6 revealed similarity of the polypeptides encoded by the cDNAs to diaminopimelate decarboxylase from Aquifex aeolicus (GenBank Accession No. AE000728 and NCBI General Identifier No. 2983642) and Pseudomonas aeruginosa (GenBank Accession No. M23174 and NCBI General Identifier No. 118304). Shown in Table 6 are the BLAST results for individual ESTs (“EST”), the sequences of the entire cDNA inserts comprising the indicated cDNA clones (“FIS”), or the sequences of FISs encoding an entire protein (“CGS”):

TABLE 6
BLAST Results for Sequences Encoding Polypeptides Homologous to Diaminopimelate Decarboxylase

                              BLAST pLog Score
Clone               Status    GI 2983642       GI 118304
                              (A. aeolicus)    (P. aeruginosa)
cen3n.pk0067.a3     FIS       58.22            56.00
cr1n.pk0103.d8      CGS       75.25            79.12
r10n.pk0013.b9      FIS       46.40            44.00
sr1.pk0132.c1       FIS       44.70            39.15
wlk1.pk0012.c2      EST       20.48            19.05

[0099] An additional soybean clone, sdp3c.pk001.o15, was identified as sharing homology with sr1.pk0132.c1. BLASTX search using the nucleotide sequences from clone sdp3c.pk001.o15 revealed similarity of the proteins encoded by the cDNA to diaminopimelate decarboxylase from Pseudomonas fluorescens (EMBL Accession No. Y12268; NCBI General Identifier No. 1929095). This EST yields a pLog value of 8.66 versus the Pseudomonas fluorescens sequence.

[0100] The sequence of the entire cDNA insert in clones sdp3c.pk001.o15 and wlk1.pk0012.c2 was determined. The BLASTX search using the EST sequences from clones listed in Table 7 revealed similarity of the polypeptides encoded by the cDNAs to diaminopimelate decarboxylase from Aquifex aeolicus (NCBI General Identifier No. 6225241) or to the Arabidopsis thaliana contig with similarity to diaminopimelate decarboxylases (NCBI General Identifier No. 9279586). Shown in Table 7 are the BLAST results for the sequences of the entire cDNA inserts comprising the indicated cDNA clones (“FIS”), or the sequences of FISs encoding the entire protein (“CGS”):

TABLE 7
BLAST Results for Sequences Encoding Polypeptides Homologous to Diaminopimelate Decarboxylase

Clone                  Status    Homolog                       BLAST pLog Score
sdp3c.pk001.o15:fis    CGS       GI 6225241 (A. aeolicus)      76.40
wlk1.pk0012.c2:fis     FIS       GI 9279586 (A. thaliana)      94.40

[0101] FIG. 3 presents an alignment of the amino acid sequences set forth in SEQ ID NOs:9, 11, 13, 15, 17, 19, 54, and 56 with the Pseudomonas aeruginosa sequence (NCBI General Identifier No. 118304; SEQ ID NO:20) and the Arabidopsis thaliana sequence (NCBI General Identifier No. 9279586; SEQ ID NO:57). The data in Table 8 present a calculation of the percent identity of the amino acid sequences set forth in SEQ ID NOs:9, 11, 13, 15, 17, 19, 54, and 56 with the Pseudomonas aeruginosa sequence (NCBI General Identifier No. 118304; SEQ ID NO:20) and the Arabidopsis thaliana sequence (NCBI General Identifier No. 9279586; SEQ ID NO:57).

TABLE 8
Percent Identity of Amino Acid Sequences Deduced From the Nucleotide Sequences of cDNA Clones Encoding Polypeptides Homologous to Diaminopimelate Decarboxylase

                       Amino acid    Percent Identity to
Clone                  SEQ ID NO.    GI 118304    GI 9279586
cen3n.pk0067.a3        9             34.0         82.2
cr1n.pk0103.d8         11            35.9         70.6
r10n.pk0013.b9         13            32.4         76.8
sr1.pk0132.c1          15            29.7         86.1
wlk1.pk0012.c2         17            42.5         93.2
sdp3c.pk001.o15        19            41.9         87.1
sdp3c.pk001.o15:fis    54            32.5         74.9
wlk1.pk0012.c2:fis     56            32.          84.9

[0102] The amino acid sequence set forth in SEQ ID NO:19 is identical to amino acids 112 through 173 of the amino acid sequence set forth in SEQ ID NO:54. The amino acid sequence set forth in SEQ ID NO:17 is identical to amino acids 24 through 96 of the amino acid sequence set forth in SEQ ID NO:56.

[0103] Sequence alignments and percent identity calculations were performed using the Megalign program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, Wis.). Multiple alignment of the sequences was performed using the Clustal method of alignment (Higgins and Sharp (1989) CABIOS. 5:151-153) with the default parameters (GAP PENALTY=10, GAP LENGTH PENALTY=10). Default parameters for pairwise alignments using the Clustal method were KTUPLE=1, GAP PENALTY=3, WINDOW=5 and DIAGONALS SAVED=5. Sequence alignments and BLAST scores and probabilities indicate that the nucleic acid fragments comprising the instant cDNA clones encode a substantial portion of one corn, one rice, two soybean and one wheat diaminopimelate decarboxylases and entire corn and soybean diaminopimelate decarboxylases.

Example 5 Characterization of cDNA Clones Encoding Homoserine Kinase

[0104] The BLASTX search using the EST sequences from clones listed in Table 9 revealed similarity of the polypeptides encoded by the cDNAs to homoserine kinase from Methanococcus jannaschii (GenBank Accession No. U67553 and NCBI General Identifier No. 1591748). Shown in Table 9 are the BLAST results for individual ESTs (“EST”) or for the sequences of the entire cDNA inserts comprising the indicated cDNA clones (“FIS”):

TABLE 9
BLAST Results for Sequences Encoding Polypeptides Homologous to Homoserine Kinase

                              BLAST pLog Score
Clone               Status    GI 1591748 (Methanococcus jannaschii)
cr1n.pk0009.g4      FIS       19.30
rca1c.pk005.k3      EST       15.21
ses8w.pk0020.b5     FIS       35.30
wl1n.pk0065.f2      EST       5.68

[0105] The sequence of the entire cDNA insert in clone rca1c.pk005.k3 was determined. The BLASTX search using the EST sequences from clones listed in Table 10 revealed similarity of the polypeptides encoded by the cDNAs to homoserine kinase from Arabidopsis thaliana (NCBI General Identifier No. 4927412). Shown in Table 10 are the BLAST results for the sequences of the entire cDNA inserts comprising the indicated cDNA clone (“FIS”):

TABLE 10
BLAST Results for Sequences Encoding Polypeptides Homologous to Homoserine Kinase

                                 BLAST pLog Score
Clone                  Status    4927412 (Arabidopsis thaliana)
rca1c.pk005.k3:fis     FIS       88.40

[0106] FIG. 4 presents an alignment of the amino acid sequences set forth in SEQ ID NOs:22, 24, 26, 28, and 59 with the Methanococcus jannaschii sequence (NCBI General Identifier No. 1591748; SEQ ID NO:29) and the Arabidopsis thaliana sequence (NCBI General Identifier No. 4927412; SEQ ID NO:60). The data in Table 11 present a calculation of the percent identity of the amino acid sequences set forth in SEQ ID NOs:22, 24, 26, 28, and 59 with the Methanococcus jannaschii sequence (NCBI General Identifier No. 1591748; SEQ ID NO:29) and the Arabidopsis thaliana sequence (NCBI General Identifier No. 4927412; SEQ ID NO:60).

TABLE 11
Percent Identity of Amino Acid Sequences Deduced From the Nucleotide Sequences of cDNA Clones Encoding Polypeptides Homologous to Homoserine Kinase

                                     Percent Identity to
Clone                  SEQ ID NO.    NCBI GI 1591748    NCBI GI 4927412
cr1n.pk0009.g4         22            25.1               65.4
rca1c.pk005.k3         24            48.8               67.1
ses8w.pk0020.b5        26            28.0               65.7
wl1n.pk0065.f2         28            29.8               67.9
rca1c.pk005.k3:fis     59            28.6               65.9

[0107] The amino acid sequence set forth in SEQ ID NO:24 is identical to amino acids 18 through 99 of the amino acid sequence set forth in SEQ ID NO:59.

[0108] Sequence alignments and percent identity calculations were performed using the Megalign program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, Wis.). Multiple alignment of the sequences was performed using the Clustal method of alignment (Higgins and Sharp (1989) CABIOS. 5:151-153) with the default parameters (GAP PENALTY=10, GAP LENGTH PENALTY=10). Default parameters for pairwise alignments using the Clustal method were KTUPLE=1, GAP PENALTY=3, WINDOW=5 and DIAGONALS SAVED=5. Sequence alignments and BLAST scores and probabilities indicate that the nucleic acid fragments comprising the instant cDNA clones encode a substantial portion of a corn and a wheat homoserine kinase, a portion and an entire rice homoserine kinase, and an entire soybean homoserine kinase.

Example 6 Characterization of cDNA Clones Encoding Cysteine Synthase

[0109] The BLASTX search using the EST sequences from the clone listed in Table 12 revealed similarity of the polypeptides encoded by the cDNAs to cysteine synthase from Citrullus lanatus (DDBJ Accession No. D28777, NCBI General Identifier No. 540497). Shown in Table 12 are the BLAST results for the sequences of the entire cDNA inserts comprising the indicated cDNA clones encoding the entire protein (“CGS”):

TABLE 12
BLAST Results for Sequences Encoding Polypeptides Homologous to Cysteine γ Synthase

                         BLAST pLog Score
Clone          Status    NCBI GI 540497 (Citrullus lanatus)
se3.05h06      CGS       182.64

[0110] Further sequencing and searching of the DuPont proprietary database allowed the identification of corn and rice clones encoding polypeptides with similarities to cysteine γ synthase. The BLAST search using the sequences from clones listed in Table 13 revealed similarity of the polypeptides encoded by the cDNAs to cysteine γ synthase from Spinacia oleracea (NCBI General Identifier No. 416869) and Solanum tuberosum (NCBI General Identifier No. 11131628). Shown in Table 13 are the BLAST results for the sequences of the entire cDNA inserts comprising the indicated cDNA clones encoding the entire protein (“CGS”):

TABLE 13
BLAST Results for Sequences Encoding Polypeptides Homologous to Cysteine γ Synthase

                                  BLAST pLog Score
Clone                   Status    NCBI GI 416869         NCBI GI 11131628
                                  (Spinacia oleracea)    (Solanum tuberosum)
Contig of:              CGS       158.00                 157.00
  cco1n.pk083.j4
  chp2.pk0016.b1
  cpd1c.pk004.b20
  cr1n.pk0083.c5
  csi1.pk0003.g6
  p0126.cnlcb49r
rls6.pk0068.b7:fis      CGS       161.00                 163.00

[0111] FIG. 5 presents an alignment of the amino acid sequences set forth in SEQ ID NOs:31, 62, and 64 with the Citrullus lanatus sequence (NCBI General Identifier No. 540497; SEQ ID NO:32), the Spinacia oleracea sequence (NCBI General Identifier No. 416869; SEQ ID NO:65), and the Solanum tuberosum sequence (NCBI General Identifier No. 11131628; SEQ ID NO:66). The data in Table 14 present a calculation of the percent identity of the amino acid sequences set forth in SEQ ID NOs:31, 62, and 64 with the Citrullus lanatus sequence (NCBI General Identifier No. 540497; SEQ ID NO:32), the Spinacia oleracea sequence (NCBI General Identifier No. 416869; SEQ ID NO:65), and the Solanum tuberosum sequence (NCBI General Identifier No. 11131628; SEQ ID NO:66).

TABLE 14
Percent Identity of Amino Acid Sequences Deduced From the Nucleotide Sequences of cDNA Clones Encoding Polypeptides Homologous to Cysteine γ Synthase

                                      Percent Identity to
                        Amino acid    NCBI         NCBI         NCBI
Clone                   SEQ ID NO.    GI 540497    GI 416869    GI 11131628
se3.05h06               31            87.1         72.3         76.9
Contig of:              62            73.8         71.3         69.7
  cco1n.pk083.j4
  chp2.pk0016.b1
  cpd1c.pk004.b20
  cr1n.pk0083.c5
  csi1.pk0003.g6
  p0126.cnlcb49r
rls6.pk0068.b7:fis      64            73.2         72.6         72.8

[0112] Sequence alignments and percent identity calculations were performed using the Megalign program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, Wis.). Multiple alignment of the sequences was performed using the Clustal method of alignment (Higgins and Sharp (1989) CABIOS. 5:151-153) with the default parameters (GAP PENALTY=10, GAP LENGTH PENALTY=10). Default parameters for pairwise alignments using the Clustal method were KTUPLE=1, GAP PENALTY=3, WINDOW=5 and DIAGONALS SAVED=5. Sequence alignments and BLAST scores and probabilities indicate that the nucleic acid fragments comprising the instant cDNA clones encode entire corn, rice, and soybean cysteine γ synthases. These sequences represent the first corn, rice, and soybean sequences encoding cysteine γ synthase known to Applicant.

Example 7 Characterization of cDNA Clones Encoding Cystathionine β-Lyase

[0113] The BLASTX search using the EST sequences from clones listed in Table 15 revealed similarity of the polypeptides encoded by the cDNAs to cystathionine β-lyase from Arabidopsis thaliana (GenBank Accession No. L40511; NCBI General Identifier No. 1708993). Shown in Table 15 are the BLAST results for individual ESTs (“EST”), the sequences of the entire cDNA inserts comprising the indicated cDNA clones (“FIS”), or the sequences of FISs encoding the entire protein (“CGS”):

TABLE 15
BLAST Results for Sequences Encoding Polypeptides Homologous to Cystathionine β-Lyase

                              BLAST pLog Score
Clone               Status    1708993 (A. thaliana)
cen1.pk0061.d4      FIS       50.41
rlr12.pk0026.g1     EST       39.00
sfl1.pk0012.c4      CGS       33.85
wr1.pk0091.g6       EST       52.52

[0114] The sequence of the entire cDNA insert in the clone wr1.pk0091.g6 was determined, RACE PCR was used to obtain the 5′ portion of the rice cystathionine β-lyase, and further sequencing and searching of the DuPont proprietary database allowed the identification of other corn and wheat clones encoding cystathionine β-lyase. The BLASTX search using the EST sequences from clones listed in Table 16 revealed similarity of the polypeptides encoded by the cDNAs to cystathionine β-lyase from Arabidopsis thaliana (GenBank Accession No. L40511; NCBI General Identifier No. 1708993). Shown in Table 16 are the BLAST results for the sequences of the entire cDNA inserts comprising the indicated cDNA clones (“FIS”), or the sequences encoding the entire protein derived from contigs assembled from the sequences of more than two ESTs, the sequence of contigs assembled from the entire cDNA inserts comprising the indicated cDNA clones and 5′ RACE PCR or an EST (“Contig*”):

TABLE 16
BLAST Results for Sequences Encoding Polypeptides Homologous to Cystathionine β-Lyase

                                   BLAST pLog Score
Clone                   Status     1708993
Contig of:              Contig*    >180.00
  cen1.pk0061.d4
  p0005.cbmei71r
  p0014.ctuui39r
  p0109.cdadg47r
  p0125.czaay16r
5′ RACE PCR +           Contig*    178.00
  rlr12.pk0026.g1:fis
wr1.pk0091.g6:fis       FIS        177.00

[0115] FIG. 6 presents an alignment of the amino acid sequences set forth in SEQ ID NOs:34, 36, 38, 40, 68, 70, and 72 with the Arabidopsis thaliana sequence (NCBI General Identifier No. 1708993; SEQ ID NO:41). The data in Table 17 present a calculation of the percent identity of the amino acid sequences set forth in SEQ ID NOs:34, 36, 38, 40, 68, 70, and 72 with the Arabidopsis thaliana sequence (NCBI General Identifier No. 1708993; SEQ ID NO:41).

TABLE 17
Percent Identity of Amino Acid Sequences Deduced From the Nucleotide Sequences of cDNA Clones Encoding Polypeptides Homologous to Cystathionine β-Lyase

                                      Percent Identity to
Clone                   SEQ ID NO.    1708993 (Arabidopsis thaliana)
cen1.pk0061.d4          34            83.0
rlr12.pk0026.g1         36            76.0
sfl1.pk0012.c4          38            72.2
wr1.pk0091.g6           40            71.8
Contig of:              68            66.8
  cen1.pk0061.d4
  p0005.cbmei71r
  p0014.ctuui39r
  p0109.cdadg47r
  p0125.czaay16r
5′ RACE PCR +           70            66.2
  rlr12.pk0026.g1:fis
wr1.pk0091.g6:fis       72            66.2

[0116] The amino acid sequence set forth in SEQ ID NO:34 is identical to amino acids 248 through 470 of the amino acid sequence set forth in SEQ ID NO:68. The amino acid sequence set forth in SEQ ID NO:36 is identical to amino acids 152 through 226 of the amino acid sequence set forth in SEQ ID NO:70. The amino acid sequence set forth in SEQ ID NO:40 is identical to amino acids 3 through 133 of the amino acid sequence set forth in SEQ ID NO:72.

[0117] Sequence alignments and percent identity calculations were performed using the Megalign program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, Wis.). Multiple alignment of the sequences was performed using the Clustal method of alignment (Higgins and Sharp (1989) CABIOS. 5:151-153) with the default parameters (GAP PENALTY=10, GAP LENGTH PENALTY=10). Default parameters for pairwise alignments using the Clustal method were KTUPLE=1, GAP PENALTY=3, WINDOW=5 and DIAGONALS SAVED=5. Sequence alignments and BLAST scores and probabilities indicate that the nucleic acid fragments comprising the instant cDNA clones encode an entire soybean cystathionine β-lyase, a substantial portion and an entire corn and rice cystathionine β-lyases, and a portion and a substantial portion of a wheat cystathionine β-lyase.
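To make the notion of percent identity concrete, the following minimal sketch computes it from two already-aligned sequences. It is not the Megalign implementation. The convention used here (identical residues divided by the number of columns in which neither sequence has a gap) is an assumption, since the text does not state which denominator the program uses, and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns (assumed convention)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical toy alignment: 5 ungapped columns, 4 of them identical.
    print(percent_identity("MKT-AYL", "MRTQAY-"))
```

Other conventions (for example, dividing by the length of the shorter sequence or by the full alignment length) give different values for the same alignment, so reported identities are only comparable when the same convention is used.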

Example 8 Expression of Chimeric Genes in Monocot Cells

[0118] A chimeric gene comprising a cDNA encoding the instant polypeptides in sense orientation with respect to the maize 27 kD zein promoter that is located 5′ to the cDNA fragment, and the 10 kD zein 3′ end that is located 3′ to the cDNA fragment, can be constructed. The cDNA fragment of this gene may be generated by polymerase chain reaction (PCR) of the cDNA clone using appropriate oligonucleotide primers. Cloning sites (Nco I or Sma I) can be incorporated into the oligonucleotides to provide proper orientation of the DNA fragment when inserted into the digested vector pML103 as described below. Amplification is then performed in a standard PCR. The amplified DNA is then digested with restriction enzymes Nco I and Sma I and fractionated on an agarose gel. The appropriate band can be isolated from the gel and combined with a 4.9 kb Nco I-Sma I fragment of the plasmid pML103. Plasmid pML103 has been deposited under the terms of the Budapest Treaty at ATCC (American Type Culture Collection, 10801 University Blvd., Manassas, Va. 20110-2209), and bears accession number ATCC 97366. The DNA segment from pML103 contains a 1.05 kb Sal I-Nco I promoter fragment of the maize 27 kD zein gene and a 0.96 kb Sma I-Sal I fragment from the 3′ end of the maize 10 kD zein gene in the vector pGem9Zf(+) (Promega). Vector and insert DNA can be ligated at 15° C. overnight, essentially as described (Maniatis). The ligated DNA may then be used to transform E. coli XL1-Blue (Epicurian Coli XL-1 Blue™; Stratagene). Bacterial transformants can be screened by restriction enzyme digestion of plasmid DNA and limited nucleotide sequence analysis using the dideoxy chain termination method (Sequenase™ DNA Sequencing Kit; U.S. Biochemical). The resulting plasmid construct would comprise a chimeric gene encoding, in the 5′ to 3′ direction, the maize 27 kD zein promoter, a cDNA fragment encoding the instant polypeptides, and the 10 kD zein 3′ region.
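The incorporation of cloning sites into PCR primers can be sketched as follows. This is an illustrative outline only: the annealing length, the two-base 5′ clamp, and the toy cDNA are hypothetical choices not specified in the text, and the sketch simply prepends the Nco I (CCATGG) and Sma I (CCCGGG) recognition sequences rather than fusing the Nco I site with the initiation codon.

```python
NCO_I = "CCATGG"  # Nco I recognition sequence
SMA_I = "CCCGGG"  # Sma I recognition sequence

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_primers(cdna: str, anneal_len: int = 18, clamp: str = "GC"):
    """Build a forward primer carrying Nco I and a reverse primer carrying Sma I.

    anneal_len and clamp are hypothetical defaults chosen for illustration.
    """
    cdna = cdna.upper()
    if len(cdna) < anneal_len:
        raise ValueError("cDNA is shorter than the annealing length")
    forward = clamp + NCO_I + cdna[:anneal_len]
    reverse = clamp + SMA_I + reverse_complement(cdna[-anneal_len:])
    return forward, reverse


if __name__ == "__main__":
    # Hypothetical toy cDNA, not one of the instant sequences.
    toy_cdna = "ATGGCTTCACTCTCTGTTTTGCGCCACAACCACTGA"
    fwd, rev = design_primers(toy_cdna)
    print("forward:", fwd)
    print("reverse:", rev)
```

Because the two primers carry different sites, the digested amplicon can ligate into the Nco I-Sma I vector fragment in only one orientation.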

[0119] The chimeric gene described above can then be introduced into corn cells by the following procedure. Immature corn embryos can be dissected from developing caryopses derived from crosses of the inbred corn lines H99 and LH132. The embryos are isolated 10 to 11 days after pollination when they are 1.0 to 1.5 mm long. The embryos are then placed with the axis-side facing down and in contact with agarose-solidified N6 medium (Chu et al. (1975) Sci. Sin. Peking 18:659-668). The embryos are kept in the dark at 27° C. Friable embryogenic callus consisting of undifferentiated masses of cells with somatic proembryoids and embryoids borne on suspensor structures proliferates from the scutellum of these immature embryos. The embryogenic callus isolated from the primary explant can be cultured on N6 medium and sub-cultured on this medium every 2 to 3 weeks.

[0120] The plasmid, p35S/Ac (obtained from Dr. Peter Eckes, Hoechst Ag, Frankfurt, Germany) may be used in transformation experiments in order to provide for a selectable marker. This plasmid contains the pat gene (see European Patent Publication 0 242 236) which encodes phosphinothricin acetyl transferase (PAT). The enzyme PAT confers resistance to herbicidal glutamine synthetase inhibitors such as phosphinothricin. The pat gene in p35S/Ac is under the control of the 35S promoter from Cauliflower Mosaic Virus (Odell et al. (1985) Nature 313:810-812) and the 3′ region of the nopaline synthase gene from the T-DNA of the Ti plasmid of Agrobacterium tumefaciens.

[0121] The particle bombardment method (Klein et al. (1987) Nature 327:70-73) may be used to transfer genes to the callus culture cells. According to this method, gold particles (1 μm in diameter) are coated with DNA using the following technique. Ten μg of plasmid DNAs are added to 50 μL of a suspension of gold particles (60 mg per mL). Calcium chloride (50 μL of a 2.5 M solution) and spermidine free base (20 μL of a 1.0 M solution) are added to the particles. The suspension is vortexed during the addition of these solutions. After 10 minutes, the tubes are briefly centrifuged (5 sec at 15,000 rpm) and the supernatant removed. The particles are resuspended in 200 μL of absolute ethanol, centrifuged again and the supernatant removed. The ethanol rinse is performed again and the particles resuspended in a final volume of 30 μL of ethanol. An aliquot (5 μL) of the DNA-coated gold particles can be placed in the center of a Kapton™ flying disc (Bio-Rad Labs). The particles are then accelerated into the corn tissue with a Biolistic™ PDS-1000/He (Bio-Rad Instruments, Hercules Calif.), using a helium pressure of 1000 psi, a gap distance of 0.5 cm and a flying distance of 1.0 cm.
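The quantities in this coating procedure imply an approximate gold and DNA load per bombardment, which the sketch below works out. It assumes, purely for illustration, that no gold is lost during the washes and that all of the DNA remains bound to the particles. Neither assumption is stated in the text, and the function name is hypothetical.

```python
def per_shot_load(gold_ul: float, gold_mg_per_ml: float, dna_ug: float,
                  final_ul: float, aliquot_ul: float):
    """Return (mg gold, ug DNA) delivered per aliquot.

    Assumes no particle loss during washes and complete DNA binding.
    """
    gold_mg = gold_ul * gold_mg_per_ml / 1000.0  # uL x mg/mL -> mg
    fraction = aliquot_ul / final_ul             # share of the prep per shot
    return gold_mg * fraction, dna_ug * fraction


if __name__ == "__main__":
    # Values from the procedure: 50 uL of 60 mg/mL gold, 10 ug DNA,
    # 30 uL final volume, 5 uL per flying disc.
    gold, dna = per_shot_load(50.0, 60.0, 10.0, 30.0, 5.0)
    print(f"gold per shot: {gold:.2f} mg, DNA per shot: {dna:.2f} ug")
```

Under these assumptions each 5 μL aliquot carries one sixth of the preparation, i.e. about 0.5 mg of gold and at most about 1.7 μg of DNA.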

[0122] For bombardment, the embryogenic tissue is placed on filter paper over agarose-solidified N6 medium. The tissue is arranged as a thin lawn and covers a circular area of about 5 cm in diameter. The petri dish containing the tissue can be placed in the chamber of the PDS-1000/He approximately 8 cm from the stopping screen. The air in the chamber is then evacuated to a vacuum of 28 inches of Hg. The macrocarrier is accelerated with a helium shock wave using a rupture membrane that bursts when the He pressure in the shock tube reaches 1000 psi.

[0123] Seven days after bombardment the tissue can be transferred to N6 medium that contains glufosinate (2 mg per liter) and lacks casein or proline. The tissue continues to grow slowly on this medium. After an additional 2 weeks the tissue can be transferred to fresh N6 medium containing glufosinate. After 6 weeks, areas of about 1 cm in diameter of actively growing callus can be identified on some of the plates containing the glufosinate-supplemented medium. These calli may continue to grow when sub-cultured on the selective medium.

[0124] Plants can be regenerated from the transgenic callus by first transferring clusters of tissue to N6 medium supplemented with 0.2 mg per liter of 2,4-D. After two weeks the tissue can be transferred to regeneration medium (Fromm et al. (1990) Bio/Technology 8:833-839).

Example 9 Expression of Chimeric Genes in Dicot Cells

[0125] A seed-specific expression cassette composed of the promoter and transcription terminator from the gene encoding the β subunit of the seed storage protein phaseolin from the bean Phaseolus vulgaris (Doyle et al. (1986) J. Biol. Chem. 261:9228-9238) can be used for expression of the instant polypeptides in transformed soybean. The phaseolin cassette includes about 500 nucleotides upstream (5′) from the translation initiation codon and about 1650 nucleotides downstream (3′) from the translation stop codon of phaseolin. Between the 5′ and 3′ regions are the unique restriction endonuclease sites Nco I (which includes the ATG translation initiation codon), Sma I, Kpn I and Xba I. The entire cassette is flanked by Hind III sites.

[0126] The cDNA fragment of this gene may be generated by polymerase chain reaction (PCR) of the cDNA clone using appropriate oligonucleotide primers. Cloning sites can be incorporated into the oligonucleotides to provide proper orientation of the DNA fragment when inserted into the expression vector. Amplification is then performed as described above, and the isolated fragment is inserted into a pUC18 vector carrying the seed expression cassette.

[0127] Soybean embryos may then be transformed with the expression vector comprising sequences encoding the instant polypeptides. To induce somatic embryos, cotyledons 3-5 mm in length, dissected from surface-sterilized, immature seeds of the soybean cultivar A2872, can be cultured in the light or dark at 26° C. on an appropriate agar medium for 6-10 weeks. Somatic embryos which produce secondary embryos are then excised and placed into a suitable liquid medium. After repeated selection for clusters of somatic embryos which multiply as early, globular-staged embryos, the suspensions are maintained as described below.

[0128] Soybean embryogenic suspension cultures can be maintained in 35 mL liquid media on a rotary shaker, 150 rpm, at 26° C. with fluorescent lights on a 16:8 hour day/night schedule. Cultures are subcultured every two weeks by inoculating approximately 35 mg of tissue into 35 mL of liquid medium.

[0129] Soybean embryogenic suspension cultures may then be transformed by the method of particle gun bombardment (Klein et al. (1987) Nature (London) 327:70-73, U.S. Pat. No. 4,945,050). A DuPont Biolistic™ PDS-1000/He instrument (helium retrofit) can be used for these transformations.

[0130] A selectable marker gene which can be used to facilitate soybean transformation is a chimeric gene composed of the 35S promoter from Cauliflower Mosaic Virus (Odell et al. (1985) Nature 313:810-812), the hygromycin phosphotransferase gene from plasmid pJR225 (from E. coli; Gritz et al. (1983) Gene 25:179-188) and the 3′ region of the nopaline synthase gene from the T-DNA of the Ti plasmid of Agrobacterium tumefaciens. The seed expression cassette comprising the phaseolin 5′ region, the fragment encoding the instant polypeptides and the phaseolin 3′ region can be isolated as a restriction fragment. This fragment can then be inserted into a unique restriction site of the vector carrying the marker gene.

[0131] To 50 μL of a 60 mg/mL 1 μm gold particle suspension is added (in order): 5 μL DNA (1 μg/μL), 20 μL spermidine (0.1 M), and 50 μL CaCl₂ (2.5 M). The particle preparation is then agitated for three minutes, spun in a microfuge for 10 seconds and the supernatant removed. The DNA-coated particles are then washed once in 400 μL 70% ethanol and resuspended in 40 μL of anhydrous ethanol. The DNA/particle suspension can be sonicated three times for one second each. Five μL of the DNA-coated gold particles are then loaded on each macro carrier disk.

[0132] Approximately 300-400 mg of a two-week-old suspension culture is placed in an empty 60×15 mm petri dish and the residual liquid removed from the tissue with a pipette. For each transformation experiment, approximately 5-10 plates of tissue are normally bombarded. Membrane rupture pressure is set at 1100 psi and the chamber is evacuated to a vacuum of 28 inches mercury. The tissue is placed approximately 3.5 inches away from the retaining screen and bombarded three times. Following bombardment, the tissue can be divided in half and placed back into liquid and cultured as described above.

[0133] Five to seven days post bombardment, the liquid media may be exchanged with fresh media, and eleven to twelve days post bombardment with fresh media containing 50 mg/mL hygromycin. The selective media can be refreshed weekly. Seven to eight weeks post bombardment, green, transformed tissue may be observed growing from untransformed, necrotic embryogenic clusters. Isolated green tissue is removed and inoculated into individual flasks to generate new, clonally propagated, transformed embryogenic suspension cultures. Each new line may be treated as an independent transformation event. These suspensions can then be subcultured and maintained as clusters of immature embryos or regenerated into whole plants by maturation and germination of individual somatic embryos.

Example 10 Expression of Chimeric Genes in Microbial Cells

[0134] The cDNAs encoding the instant polypeptides can be inserted into the T7 E. coli expression vector pBT430. This vector is a derivative of pET-3a (Rosenberg et al. (1987) Gene 56:125-135) which employs the bacteriophage T7 RNA polymerase/T7 promoter system. Plasmid pBT430 was constructed by first destroying the EcoR I and Hind III sites in pET-3a at their original positions. An oligonucleotide adaptor containing EcoR I and Hind III sites was inserted at the BamH I site of pET-3a. This created pET-3aM with additional unique cloning sites for insertion of genes into the expression vector. Then, the Nde I site at the position of translation initiation was converted to an Nco I site using oligonucleotide-directed mutagenesis. The DNA sequence of pET-3aM in this region, 5′-CATATGG, was converted to 5′-CCCATGG in pBT430.
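The site conversion described above can be checked with a few lines of code. The sketch below is only a consistency check on the two seven-base sequences given in the text, using the Nde I (CATATG) and Nco I (CCATGG) recognition sequences. The helper names are hypothetical.

```python
NDE_I = "CATATG"  # Nde I recognition sequence
NCO_I = "CCATGG"  # Nco I recognition sequence

BEFORE = "CATATGG"  # pET-3aM sequence at the translation start
AFTER = "CCCATGG"   # pBT430 sequence after mutagenesis


def has_site(seq: str, site: str) -> bool:
    """True if the recognition sequence occurs in seq."""
    return site in seq.upper()


def substitutions(a: str, b: str):
    """List (1-based position, old base, new base) where two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


if __name__ == "__main__":
    print("Nde I before/after:", has_site(BEFORE, NDE_I), has_site(AFTER, NDE_I))
    print("Nco I before/after:", has_site(BEFORE, NCO_I), has_site(AFTER, NCO_I))
    print("ATG retained:", "ATG" in BEFORE and "ATG" in AFTER)
    print("changes:", substitutions(BEFORE, AFTER))
```

Two base substitutions at positions 2 and 3 remove the Nde I site and create the Nco I site while the ATG initiation codon is retained in both sequences.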

[0135] Plasmid DNA containing a cDNA may be appropriately digested to release a nucleic acid fragment encoding the protein. This fragment may then be purified on a 1% low melting agarose gel. Buffer and agarose contain 10 μg/mL ethidium bromide for visualization of the DNA fragment. The fragment can then be purified from the agarose gel by digestion with GELase™ (Epicentre Technologies, Madison, Wis.) according to the manufacturer's instructions, ethanol precipitated, dried and resuspended in 20 μL of water. Appropriate oligonucleotide adapters may be ligated to the fragment using T4 DNA ligase (New England Biolabs (NEB), Beverly, Mass.). The fragment containing the ligated adapters can be purified from the excess adapters using low melting agarose as described above. The vector pBT430 is digested, dephosphorylated with alkaline phosphatase (NEB) and deproteinized with phenol/chloroform as described above. The prepared vector pBT430 and fragment can then be ligated at 16° C. for 15 hours followed by transformation into DH5 electrocompetent cells (GIBCO BRL). Transformants can be selected on agar plates containing LB media and 100 μg/mL ampicillin. Transformants containing the gene encoding the instant polypeptides are then screened for the correct orientation with respect to the T7 promoter by restriction enzyme analysis.

[0136] For high level expression, a plasmid clone with the cDNA insert in the correct orientation relative to the T7 promoter can be transformed into E. coli strain BL21 (DE3) (Studier et al. (1986) J. Mol. Biol. 189:113-130). Cultures are grown in LB medium containing ampicillin (100 mg/L) at 25° C. At an optical density at 600 nm of approximately 1, IPTG (isopropylthio-β-galactoside, the inducer) can be added to a final concentration of 0.4 mM and incubation can be continued for 3 h at 25° C. Cells are then harvested by centrifugation and re-suspended in 50 μL of 50 mM Tris-HCl at pH 8.0 containing 0.1 mM DTT and 0.2 mM phenylmethylsulfonyl fluoride. A small amount of 1 mm glass beads can be added and the mixture sonicated 3 times for about 5 seconds each time with a microprobe sonicator. The mixture is centrifuged and the protein concentration of the supernatant determined. One μg of protein from the soluble fraction of the culture can be separated by SDS-polyacrylamide gel electrophoresis. Gels can be observed for protein bands migrating at the expected molecular weight.
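The induction step requires adding enough IPTG stock to reach 0.4 mM in the culture. A minimal dilution sketch follows. The stock concentration (100 mM) and culture volume (50 mL) are hypothetical values chosen for illustration, since the text specifies only the final concentration.

```python
def stock_volume_ml(stock_mm: float, final_mm: float, culture_ml: float) -> float:
    """Volume of stock (mL) to add so that the mixed culture reaches final_mm.

    Solves stock_mm * v = final_mm * (culture_ml + v) for v, i.e. the volume
    added with the stock itself is taken into account.
    """
    if not 0.0 < final_mm < stock_mm:
        raise ValueError("final concentration must be positive and below the stock")
    return final_mm * culture_ml / (stock_mm - final_mm)


if __name__ == "__main__":
    # Hypothetical 100 mM stock and 50 mL culture; 0.4 mM final is from the text.
    v = stock_volume_ml(100.0, 0.4, 50.0)
    print(f"add {v * 1000:.0f} uL of stock")
```

For a concentrated stock the correction for the added volume is negligible, and the simpler C1V1 = C2V2 estimate gives nearly the same answer.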

Example 11 Evaluating Compounds for Their Ability to Inhibit the Activity of Plant Biosynthetic Enzymes

[0137] The polypeptides described herein may be produced using any number of methods known to those skilled in the art. Such methods include, but are not limited to, expression in bacteria as described in Example 10, or expression in eukaryotic cell culture, in planta, and using viral expression systems in suitably infected organisms or cell lines. The instant polypeptides may be expressed either as mature forms of the proteins as observed in vivo or as fusion proteins by covalent attachment to a variety of enzymes, proteins or affinity tags. Common fusion protein partners include glutathione S-transferase (“GST”), thioredoxin (“Trx”), maltose binding protein, and C- and/or N-terminal hexahistidine polypeptide (“(His)₆”). The fusion proteins may be engineered with a protease recognition site at the fusion point so that fusion partners can be separated by protease digestion to yield intact mature enzyme. Examples of such proteases include thrombin, enterokinase and factor Xa. However, any protease can be used which specifically cleaves the peptide connecting the fusion protein and the enzyme.

[0138] Purification of the instant polypeptides, if desired, may utilize any number of separation technologies familiar to those skilled in the art of protein purification. Examples of such methods include, but are not limited to, homogenization, filtration, centrifugation, heat denaturation, ammonium sulfate precipitation, desalting, pH precipitation, ion exchange chromatography, hydrophobic interaction chromatography and affinity chromatography, wherein the affinity ligand represents a substrate, substrate analog or inhibitor. When the instant polypeptides are expressed as fusion proteins, the purification protocol may include the use of an affinity resin which is specific for the fusion protein tag attached to the expressed enzyme or an affinity resin containing ligands which are specific for the enzyme. For example, the instant polypeptides may be expressed as a fusion protein coupled to the C-terminus of thioredoxin. In addition, a (His)₆ peptide may be engineered into the N-terminus of the fused thioredoxin moiety to afford additional opportunities for affinity purification. Other suitable affinity resins could be synthesized by linking the appropriate ligands to any suitable resin such as Sepharose-4B. In an alternate embodiment, a thioredoxin fusion protein may be eluted using dithiothreitol; however, elution may be accomplished using other reagents which interact to displace the thioredoxin from the resin. These reagents include β-mercaptoethanol or other reduced thiol. The eluted fusion protein may be subjected to further purification by traditional means as stated above, if desired. Proteolytic cleavage of the thioredoxin fusion protein and the enzyme may be accomplished after the fusion protein is purified or while the protein is still bound to the ThioBond™ affinity resin or other resin.

[0139] Crude, partially purified or purified enzyme, either alone or as a fusion protein, may be utilized in assays for the evaluation of compounds for their ability to inhibit enzymatic activation of the instant polypeptides disclosed herein. Assays may be conducted under well known experimental conditions which permit optimal enzymatic activity. Examples of assays for many of these enzymes can be found in Methods in Enzymology Vol. V, (Colowick and Kaplan eds.) Academic Press, New York or Methods in Enzymology Vol. XVII, (Tabor and Tabor eds.) Academic Press, New York. Specific examples may be found in the following references, each of which is incorporated herein by reference: aspartic semialdehyde dehydrogenase may be assayed as described in Black et al. (1955) J. Biol. Chem. 213:39-50, or Cremer et al. (1988) J. Gen. Microbiol. 134:3221-3229; diaminopimelate decarboxylase may be assayed as described in Work (1962) in Methods in Enzymology Vol. V, (Colowick and Kaplan eds.) 864-870, Academic Press, New York or Cremer et al. (1988) J. Gen. Microbiol. 134:3221-3229; homoserine kinase may be assayed as described in Aarnes (1976) Plant Sci. Lett. 7:187-194; cysteine synthase may be assayed as described in Thompson et al. (1968) Biochem. Biophys. Res. Commun. 31:281-286 or Bertagnolli et al. (1977) Plant Physiol. 60:115-121; and cystathionine β-lyase may be assayed as described in Giovanelli et al. (1971) Biochim. Biophys. Acta 227:654-670 or Droux et al. (1995) Arch. Biochem. Biophys. 316:585-595.

1 72 1 826 DNA Oryza sativa 1 tggtaccgcc acgccaaggt ggtaaggatggttgtcagca cttaccaagc agcaagtggt 60 gctggggctg cggccatgga agaactcaaacttcaaactc aagaggtctt ggcggggaaa 120 gcaccaacat gcaacatttt cagtcagcagtatgctttta atatattttc acataatgca 180 ccaattgttg aaaatgggta caatgaggaggagatgaaga tggtgaagga gaccagaaaa 240 atctggaatg ataaagatgt gaaggtaactgcaacctgca tacgagttcc tgtgatgcgt 300 gcacatgctg aaagtgtgaa tctacagtttgaaaagccac ttgatgagga tactgcaagg 360 gaaatcttga gggcagctga aggtgttaccattattgatg accgtgcttc caatcgcttc 420 cccacacctc ttgaggtatc ggataaagatgatgtagcag tgggtagaat tcgtcaggat 480 ttgtcgcaag atgataacaa agggctggacatatttgttt gtggagatca aatacgtaaa 540 ggtgctgcac tcaatgctgt gcagattgctgaaatgctac tcaagtgatt ttcttttctg 600 tacctttctc tccttgcccc tctttgctctagtcattgtt tgacggatgt actctggtta 660 gtatgagatc aattttgatc atcttttgtaatctatattc ctagtgaaat aaatgtaaaa 720 cggttttgct ctatcttctg cacaagtgtagaagaaatct gaaattggga aattggagtg 780 tggcccttgt tcaaaaaaaa aaaaaaaaaaaaaaaaaaaa aaaaaa 826 2 195 PRT Oryza sativa 2 Trp Tyr Arg His Ala LysVal Val Arg Met Val Val Ser Thr Tyr Gln 1 5 10 15 Ala Ala Ser Gly AlaGly Ala Ala Ala Met Glu Glu Leu Lys Leu Gln 20 25 30 Thr Gln Glu Val LeuAla Gly Lys Ala Pro Thr Cys Asn Ile Phe Ser 35 40 45 Gln Gln Tyr Ala PheAsn Ile Phe Ser His Asn Ala Pro Ile Val Glu 50 55 60 Asn Gly Tyr Asn GluGlu Glu Met Lys Met Val Lys Glu Thr Arg Lys 65 70 75 80 Ile Trp Asn AspLys Asp Val Lys Val Thr Ala Thr Cys Ile Arg Val 85 90 95 Pro Val Met ArgAla His Ala Glu Ser Val Asn Leu Gln Phe Glu Lys 100 105 110 Pro Leu AspGlu Asp Thr Ala Arg Glu Ile Leu Arg Ala Ala Glu Gly 115 120 125 Val ThrIle Ile Asp Asp Arg Ala Ser Asn Arg Phe Pro Thr Pro Leu 130 135 140 GluVal Ser Asp Lys Asp Asp Val Ala Val Gly Arg Ile Arg Gln Asp 145 150 155160 Leu Ser Gln Asp Asp Asn Lys Gly Leu Asp Ile Phe Val Cys Gly Asp 165170 175 Gln Ile Arg Lys Gly Ala Ala Leu Asn Ala Val Gln Ile Ala Glu Met180 185 190 Leu Leu Lys 195 3 875 DNA Triticum aestivum 3 cctcatggctgtcacgccgc tgcatcgcca cgccaaggtg aaaaggatgg 
ttgtcagcac 60 ataccaagcagcaagtggtg ctggtgctgc agccatggaa gaactcaaac ttcagactcg 120 agaggtcttggaaggaaagc caccaacctg taacattttc agtcaacagt atgcttttaa 180 tatattttcgcataatgcac ctattgttga aaatggctat aatgaggaag agatgaaaat 240 ggtgaaggagaccagaaaaa tctggaatga caaggatgta agagtaactg caacttgtat 300 acgggttcctacgatgcgcg cgcatgccga aagcgtgaat ctacagtttg aaaagccact 360 tgatgaggacactgccagag aaatcttgag ggcagctcct ggtgttacca ttagtgacga 420 ccgtgctgccaaccgcttcc ctacaccact ggaggtatcg gataaagatg acgtatcagt 480 tggtaggattcgccaggact tgtcacaaga tgataacaga gggttggagt tatttgtctg 540 tggagaccagatacgtaaag gcgccgcgct gaacgctgtg cagattgctg aaatgctact 600 gaagtgaccgcctttttacc attgtctcat gtgccacgtt gctctatcca ttgatggatt 660 gatgtactctagtcactttc aacccagttt tggtcgtcgt cttttttgta atctgtcaac 720 ctagcagaagaagtgtaaga cgggctttag tcatctgttg cacacaaaag tgcagccaca 780 agtttagaaaaggagggttt tcacttgttc ggattttgcc ttaggttgga ctttgttgca 840 agttgtcgtttgtttcttga aagctggtct gctgt 875 4 201 PRT Triticum aestivum 4 Leu MetAla Val Thr Pro Leu His Arg His Ala Lys Val Lys Arg Met 1 5 10 15 ValVal Ser Thr Tyr Gln Ala Ala Ser Gly Ala Gly Ala Ala Ala Met 20 25 30 GluGlu Leu Lys Leu Gln Thr Arg Glu Val Leu Glu Gly Lys Pro Pro 35 40 45 ThrCys Asn Ile Phe Ser Gln Gln Tyr Ala Phe Asn Ile Phe Ser His 50 55 60 AsnAla Pro Ile Val Glu Asn Gly Tyr Asn Glu Glu Glu Met Lys Met 65 70 75 80Val Lys Glu Thr Arg Lys Ile Trp Asn Asp Lys Asp Val Arg Val Thr 85 90 95Ala Thr Cys Ile Arg Val Pro Thr Met Arg Ala His Ala Glu Ser Val 100 105110 Asn Leu Gln Phe Glu Lys Pro Leu Asp Glu Asp Thr Ala Arg Glu Ile 115120 125 Leu Arg Ala Ala Pro Gly Val Thr Ile Ser Asp Asp Arg Ala Ala Asn130 135 140 Arg Phe Pro Thr Pro Leu Glu Val Ser Asp Lys Asp Asp Val SerVal 145 150 155 160 Gly Arg Ile Arg Gln Asp Leu Ser Gln Asp Asp Asn ArgGly Leu Glu 165 170 175 Leu Phe Val Cys Gly Asp Gln Ile Arg Lys Gly AlaAla Leu Asn Ala 180 185 190 Val Gln Ile Ala Glu Met Leu Leu Lys 195 2005 457 DNA Glycine max unsure (211) n = A, C, G or T 5 gtctgttttaaaatccaaca 
cttaatctct ctcttcgcag cctaaaatcc caatggcttc 60 actctctgttttgcgccaca accacctctt ctcgggcccc ctcccggccc gccccaagcc 120 cacctcctcctcctcctcca ggatccgaat gtccctccgc gagaacggcc cctccatcgc 180 cgtcgtgggcgtcaccggcg ccgtcggcca ngagttcctc tccgtcctct ccgaccgcga 240 cttcccctaccgctccattc atatgctggc ttccaagcgc tccgctggac gccgcatcac 300 cttcgaggacagggactacn tcttcaggag ctcacgccgg agagttcgac ggtgtcgaca 360 tcgcgctcttcagcgcnggg ggtccatcaa nnaagcattc ggaccatcgn cgtaaatcgn 420 gggacggncgtngncaanat anctccggtt ncctttg 457 6 86 PRT Glycine max 6 Met Ala Ser LeuSer Val Leu Arg His Asn His Leu Phe Ser Gly Pro 1 5 10 15 Leu Pro AlaArg Pro Lys Pro Thr Ser Ser Ser Ser Ser Arg Ile Arg 20 25 30 Met Ser LeuArg Glu Asn Gly Pro Ser Ile Ala Val Val Gly Val Thr 35 40 45 Gly Ala ValGly Gln Glu Phe Leu Ser Val Leu Ser Asp Arg Asp Phe 50 55 60 Pro Tyr ArgSer Ile His Met Leu Ala Ser Lys Arg Ser Ala Gly Arg 65 70 75 80 Arg IleThr Phe Glu Asp 85 7 160 PRT Legionella pneumophila 7 Met Ser Arg HisLeu Asn Val Ala Ile Val Gly Ala Thr Gly Ala Val 1 5 10 15 Gly Glu ThrPhe Leu Thr Val Leu Glu Glu Arg Asn Phe Pro Ile Lys 20 25 30 Ser Leu TyrPro Leu Ala Ser Ser Arg Ser Val Gly Lys Thr Val Thr 35 40 45 Phe Arg AspGln Glu Leu Asp Val Leu Asp Leu Ala Glu Phe Asp Phe 50 55 60 Ser Lys ValAsp Leu Ala Leu Phe Ser Ala Gly Gly Ala Val Ser Lys 65 70 75 80 Glu TyrAla Pro Lys Ala Val Ala Ala Gly Cys Val Val Val Asp Asn 85 90 95 Thr SerCys Phe Arg Tyr Glu Asp Asp Ile Pro Leu Val Val Pro Gly 100 105 110 SerGlu Ser Ser Ser Asn Arg Asp Tyr Thr Lys Arg Gly Ile Ile Ala 115 120 125Asn Pro Asn Cys Ser Thr Ile Gln Met Val Val Ala Leu Lys Pro Ile 130 135140 Tyr Asp Ala Val Gly Ile Ser Arg Ile Asn Val Ala Thr Tyr Gln Ser 145150 155 160 8 1054 DNA Zea mays 8 atttaacgga aatgggaaga cactcgaacatcttaaatta gctgctgaga gtggagtatt 60 tgtaaatgtg gatagcgaat ttgatttggagaatattgtc agagctgcaa gagctactgg 120 aaagaaagtg cctgttttgc ttcgaataaatccagatgtg gatccgcagg tacatcctta 180 tgttgccacg ggaaataaaa cgtctaaatttgggatccgc aatgagaaat tgcaatggtt 240 tttggactct 
atcaagtcat acccgaatgaaatcaaactc gttggtgttc attgccatct 300 gggatctact attacaaagg ttgatatattcagagatgct gcagttctta tgctgaatta 360 tgtcgatgaa attcgagcac aaggttttaagttggagtac ctgaatatcg gaggtggttt 420 gggaatagat taccatcata ccgatgcagtcttacctaca cctatggatc tcatcaacac 480 tgtgcgagaa ttagttctct ctcaagatctcactcttatt attgaacccg gaagatcctt 540 gattgctaat acttgctgct tcgtcaatagagtaactggt gttaaatcta atggtacaaa 600 gaatttcatt gttgttgatg gcagcatggcagaactcatc agacctagtc tgtatggagc 660 ataccagcat atcgaactgg tctctccccccactcctggt gctgaagcag cgaccttcga 720 tattgttgga ccagtttgtg agtctgcagatttccttgga aaagataggg aacttccaac 780 acctgatgag ggagctggac tggttgttcatgatgcaggt gcctactgca tgagcatggc 840 ttccacctac aacctgaagt tgaggccaccggaatactgg gtggaagcgg acggttcgat 900 cgttaagatc aggcatggag agaagcttgatgactacatg aagttctttg atggtcttcc 960 tgcttagatg tttattatct gcgactgctacggacgatgt tttcttgggg ataattggat 1020 tttctttgtc aaaaaaaaaa aaaaaaaaaaaaaa 1054 9 321 PRT Zea mays 9 Phe Asn Gly Asn Gly Lys Thr Leu Glu HisLeu Lys Leu Ala Ala Glu 1 5 10 15 Ser Gly Val Phe Val Asn Val Asp SerGlu Phe Asp Leu Glu Asn Ile 20 25 30 Val Arg Ala Ala Arg Ala Thr Gly LysLys Val Pro Val Leu Leu Arg 35 40 45 Ile Asn Pro Asp Val Asp Pro Gln ValHis Pro Tyr Val Ala Thr Gly 50 55 60 Asn Lys Thr Ser Lys Phe Gly Ile ArgAsn Glu Lys Leu Gln Trp Phe 65 70 75 80 Leu Asp Ser Ile Lys Ser Tyr ProAsn Glu Ile Lys Leu Val Gly Val 85 90 95 His Cys His Leu Gly Ser Thr IleThr Lys Val Asp Ile Phe Arg Asp 100 105 110 Ala Ala Val Leu Met Leu AsnTyr Val Asp Glu Ile Arg Ala Gln Gly 115 120 125 Phe Lys Leu Glu Tyr LeuAsn Ile Gly Gly Gly Leu Gly Ile Asp Tyr 130 135 140 His His Thr Asp AlaVal Leu Pro Thr Pro Met Asp Leu Ile Asn Thr 145 150 155 160 Val Arg GluLeu Val Leu Ser Gln Asp Leu Thr Leu Ile Ile Glu Pro 165 170 175 Gly ArgSer Leu Ile Ala Asn Thr Cys Cys Phe Val Asn Arg Val Thr 180 185 190 GlyVal Lys Ser Asn Gly Thr Lys Asn Phe Ile Val Val Asp Gly Ser 195 200 205Met Ala Glu Leu Ile Arg Pro Ser Leu Tyr Gly Ala Tyr Gln His Ile 210 215220 Glu Leu 
Val Ser Pro Pro Thr Pro Gly Ala Glu Ala Ala Thr Phe Asp 225230 235 240 Ile Val Gly Pro Val Cys Glu Ser Ala Asp Phe Leu Gly Lys AspArg 245 250 255 Glu Leu Pro Thr Pro Asp Glu Gly Ala Gly Leu Val Val HisAsp Ala 260 265 270 Gly Ala Tyr Cys Met Ser Met Ala Ser Thr Tyr Asn LeuLys Leu Arg 275 280 285 Pro Pro Glu Tyr Trp Val Glu Ala Asp Gly Ser IleVal Lys Ile Arg 290 295 300 His Gly Glu Lys Leu Asp Asp Tyr Met Lys PhePhe Asp Gly Leu Pro 305 310 315 320 Ala 10 1813 DNA Zea mays 10cgcttcctgg aaggctggaa cagaaagaac cctaaaccct agcaatggcg gcggcgaacc 60tgctgtcgcg ctcccttctc cccaccccaa acactatccg aacgagccac cccaccccgc 120ggagcccagc cgtcgtctcc ttcccccgcc gccgtgcccg cctgtccgtg tgcgcctccg 180tctccatggc ctccccgtcc ccaccgccac agcccgcggc ggccggcgtg ccgaagcact 240gcttccggcg cggcgccgac ggctacctgt actgcgaggg agtgagggtg gaagacgcga 300tggcggctgc cgagcgcagc cccttctatc tctacagcaa gcttcagatc ctccgcaact 360tcgccgctta ccgcgacgct ctccaggggc tccgctccat cgtcgggtat gccgtgaagg 420ccaacaataa cctccccgtg ctacgcgtcc tgcgtgagct tggctgcggc gccgtcctcg 480tcagcggcaa cgagctccga ctcgccctcc aggcgggatt cgaccccgcc aggtgtatat 540ttaacggaaa tgggaagaca ctcgaagatc ttaaattggc tgctgagagt ggagtatttg 600taaatgtgga tagtgaattt gatttagaga atattgtcag agctgcaaga gctactggaa 660agaaagtgcc tgttttactt agaataaatc cagatgtgga tccacaggta catccatatg 720ttgccacggg aaataaaaca tccaaattcg ggatccgcaa tgagaaattg caatggtttt 780tgaactctat caagtcatac tcgaatgaaa tcaaactcgt tggtgttcat tgccatctgg 840gatctactat tacaaaggtt gatatattca gagatgctgc agtgcttatg gtgaattatg 900tcgatgaaat tcgagcacaa ggttttaagt tggagtacct gaatattgga ggtggtttgg 960gaatagatta ccatcatacc gatgcagtct tacctacacc tatggatctc atcaacactg 1020tacgagaatt agttctctct caagatctta ctcttattat tgaacctgga agatccttga 1080ttgctaatac ttgctgcttc gtcaatagag taactggtgt taaatctaat ggtacaaaga 1140atttcattgt tgttgatggc agcatggcag aactcatcag acctagcctg tatggagcat 1200atcagcatat cgaattggtc tctcccccca ctcctggtgc tgaagtagcg accttcgata 1260ttgttgggcc agtttgtgag tctgcagatt tccttggaaa agatagggaa cttccaacac 
1320ctgatgaggg agctggactg gttgttcatg atgcaggtgc ctactgcatg agcatggctt 1380ccacctacaa cctgaagttg aggccgccag agtactgggt tgaagaggat ggttcgattg 1440ttaagatcag gcatgaagag aagctcgatg actacatgaa gttctttgat ggtcttcctg 1500cttagatgtt tatttgtgac tgctaggggc gatgttttct tggagataat tgaatttttc 1560tttgtcaagc tcattttgct ttcttgtggt tgttatggaa tgttactgga tactggatag 1620ttagttcggc ctgtaggcgt atcctcctga acttacctct cattgctgtt agttttggca 1680ccaagtttgt tcccaattgc tatttacgga agttattgca taaagggctg tttggttgta 1740atcttcccgt aagaataaga tgcatgtttt tgagttaaaa aagggggggc ccggtaccca 1800attcgcccta tag 1813 11 486 PRT Zea mays 11 Met Ala Ala Ala Asn Leu LeuSer Arg Ser Leu Leu Pro Thr Pro Asn 1 5 10 15 Thr Ile Arg Thr Ser HisPro Thr Pro Arg Ser Pro Ala Val Val Ser 20 25 30 Phe Pro Arg Arg Arg AlaArg Leu Ser Val Cys Ala Ser Val Ser Met 35 40 45 Ala Ser Pro Ser Pro ProPro Gln Pro Ala Ala Ala Gly Val Pro Lys 50 55 60 His Cys Phe Arg Arg GlyAla Asp Gly Tyr Leu Tyr Cys Glu Gly Val 65 70 75 80 Arg Val Glu Asp AlaMet Ala Ala Ala Glu Arg Ser Pro Phe Tyr Leu 85 90 95 Tyr Ser Lys Leu GlnIle Leu Arg Asn Phe Ala Ala Tyr Arg Asp Ala 100 105 110 Leu Gln Gly LeuArg Ser Ile Val Gly Tyr Ala Val Lys Ala Asn Asn 115 120 125 Asn Leu ProVal Leu Arg Val Leu Arg Glu Leu Gly Cys Gly Ala Val 130 135 140 Leu ValSer Gly Asn Glu Leu Arg Leu Ala Leu Gln Ala Gly Phe Asp 145 150 155 160Pro Ala Arg Cys Ile Phe Asn Gly Asn Gly Lys Thr Leu Glu Asp Leu 165 170175 Lys Leu Ala Ala Glu Ser Gly Val Phe Val Asn Val Asp Ser Glu Phe 180185 190 Asp Leu Glu Asn Ile Val Arg Ala Ala Arg Ala Thr Gly Lys Lys Val195 200 205 Pro Val Leu Leu Arg Ile Asn Pro Asp Val Asp Pro Gln Val HisPro 210 215 220 Tyr Val Ala Thr Gly Asn Lys Thr Ser Lys Phe Gly Ile ArgAsn Glu 225 230 235 240 Lys Leu Gln Trp Phe Leu Asn Ser Ile Lys Ser TyrSer Asn Glu Ile 245 250 255 Lys Leu Val Gly Val His Cys His Leu Gly SerThr Ile Thr Lys Val 260 265 270 Asp Ile Phe Arg Asp Ala Ala Val Leu MetVal Asn Tyr Val Asp Glu 275 280 285 Ile Arg Ala Gln Gly Phe Lys Leu GluTyr Leu Asn 
Ile Gly Gly Gly 290 295 300 Leu Gly Ile Asp Tyr His His ThrAsp Ala Val Leu Pro Thr Pro Met 305 310 315 320 Asp Leu Ile Asn Thr ValArg Glu Leu Val Leu Ser Gln Asp Leu Thr 325 330 335 Leu Ile Ile Glu ProGly Arg Ser Leu Ile Ala Asn Thr Cys Cys Phe 340 345 350 Val Asn Arg ValThr Gly Val Lys Ser Asn Gly Thr Lys Asn Phe Ile 355 360 365 Val Val AspGly Ser Met Ala Glu Leu Ile Arg Pro Ser Leu Tyr Gly 370 375 380 Ala TyrGln His Ile Glu Leu Val Ser Pro Pro Thr Pro Gly Ala Glu 385 390 395 400Val Ala Thr Phe Asp Ile Val Gly Pro Val Cys Glu Ser Ala Asp Phe 405 410415 Leu Gly Lys Asp Arg Glu Leu Pro Thr Pro Asp Glu Gly Ala Gly Leu 420425 430 Val Val His Asp Ala Gly Ala Tyr Cys Met Ser Met Ala Ser Thr Tyr435 440 445 Asn Leu Lys Leu Arg Pro Pro Glu Tyr Trp Val Glu Glu Asp GlySer 450 455 460 Ile Val Lys Ile Arg His Glu Glu Lys Leu Asp Asp Tyr MetLys Phe 465 470 475 480 Phe Asp Gly Leu Pro Ala 485 12 1116 DNA Oryzasativa 12 cttacacgga gtgtttgtaa acatagacag tgaatttgat ttggagaatattgtcactgc 60 tgcgagagtt gctgggaaga aagtccctgt tttgctcagg ataaacccagatgtggatcc 120 acaggtccat ccttatgttg cgactggaaa caaaacctcc aaatttggtatccgtaatga 180 gaaactacaa tggttcttag actctatcaa gtcatactca aatgatatcacactggtggg 240 tgttcattgt catctgggat ctaccattac aaaggtcgat atatttagagatgcggcagg 300 tcttatggtg aattatgttg atgaaattcg agcacaaggt tttgaactggaatatctcaa 360 tattggcggt ggcctgggca tagwttatca ccacacggat gcagtcttgcctacacctat 420 gggacctcat caacactgtg ccgaagaatt agttctgtca cgagatcttacactcatcat 480 tgaacctggg agatccctca tagctaacac ttgctgcttc gtcaatagggtcactggtgt 540 taaatctaat ggtacaaaga atttcattgt agttgatggc agcatggcagagcttatcag 600 accaagtcta tatggagcat accagcatat cgaactggtt tctccttccccagatgcaga 660 agtagcaaca ttcgatattg ttggaccagt ttgtgaatct gcagatttccttggcaaaga 720 cagggaactt ccaacacctg ataagggagc tggtttggtg gttcatgacgcaggagccta 780 ctgcatgagc atggcttcaa cctacaactt gaagttgcga ccacctgaatattgggtaga 840 agatgatggg tccattgcta agattcggcg tggagagtca tttgatgactacatgaagtt 900 ctttgataat ctctctgcct aactcgtttt cctgcaattg 
taataagatttttctcttgt 960 tatgtgtggc tgtatcagga ttcggattga tagcgcagta cagtttgctgtagaatcggt 1020 attttttttt attgtactgt gatgtcggta ccttatttta tccaaagatttttggcaaat 1080 tttgctacag gacacttaaa aaaaaaaaaa aaaaaa 1116 13 306 PRTOryza sativa UNSURE (128) Xaa = ANY AMINO ACID 13 Leu His Gly Val PheVal Asn Ile Asp Ser Glu Phe Asp Leu Glu Asn 1 5 10 15 Ile Val Thr AlaAla Arg Val Ala Gly Lys Lys Val Pro Val Leu Leu 20 25 30 Arg Ile Asn ProAsp Val Asp Pro Gln Val His Pro Tyr Val Ala Thr 35 40 45 Gly Asn Lys ThrSer Lys Phe Gly Ile Arg Asn Glu Lys Leu Gln Trp 50 55 60 Phe Leu Asp SerIle Lys Ser Tyr Ser Asn Asp Ile Thr Leu Val Gly 65 70 75 80 Val His CysHis Leu Gly Ser Thr Ile Thr Lys Val Asp Ile Phe Arg 85 90 95 Asp Ala AlaGly Leu Met Val Asn Tyr Val Asp Glu Ile Arg Ala Gln 100 105 110 Gly PheGlu Leu Glu Tyr Leu Asn Ile Gly Gly Gly Leu Gly Ile Xaa 115 120 125 TyrHis His Thr Asp Ala Val Leu Pro Thr Pro Met Gly Pro His Gln 130 135 140His Cys Ala Glu Glu Leu Val Leu Ser Arg Asp Leu Thr Leu Ile Ile 145 150155 160 Glu Pro Gly Arg Ser Leu Ile Ala Asn Thr Cys Cys Phe Val Asn Arg165 170 175 Val Thr Gly Val Lys Ser Asn Gly Thr Lys Asn Phe Ile Val ValAsp 180 185 190 Gly Ser Met Ala Glu Leu Ile Arg Pro Ser Leu Tyr Gly AlaTyr Gln 195 200 205 His Ile Glu Leu Val Ser Pro Ser Pro Asp Ala Glu ValAla Thr Phe 210 215 220 Asp Ile Val Gly Pro Val Cys Glu Ser Ala Asp PheLeu Gly Lys Asp 225 230 235 240 Arg Glu Leu Pro Thr Pro Asp Lys Gly AlaGly Leu Val Val His Asp 245 250 255 Ala Gly Ala Tyr Cys Met Ser Met AlaSer Thr Tyr Asn Leu Lys Leu 260 265 270 Arg Pro Pro Glu Tyr Trp Val GluAsp Asp Gly Ser Ile Ala Lys Ile 275 280 285 Arg Arg Gly Glu Ser Phe AspAsp Tyr Met Lys Phe Phe Asp Asn Leu 290 295 300 Ser Ala 305 14 968 DNAGlycine max 14 gttgccactg ggaataagaa ctctaaattt ggcattagaa atgagaagctgcagtgcttt 60 ttagatgcag tgaaggaaca tcctaatgag ctcaaacttg taggggcccactgccatctt 120 ggttcaacaa ttaccaaggt tgacattttc agggatgcag ccaccattatgatcaactac 180 attgaccaaa tccgagatca gggttttgaa gttgattact taaatattggtggaggactt 240 
gggatagatt attatcattc tggtgccatc cttcctacac ctagagatctcattgacact 300 gtacgagatc ttgttatttc acgtggtctt aatctcatca ttgaaccaggaagatcactc 360 attgcaaaca cgtgttgctt agttaaccgg gtgacaggtg ttaaaactaatggatctaaa 420 aacttcattg taattgatgg aagtatggct gaacttatcc gccctagtctttatgatgct 480 taccagcata tagagctggt ttcccctgcc ccgtcaaatg ctgaaacagaaacttttgat 540 gtggttggcc ctgtctgtga gtctgcagat ttcttaggaa aaggaagagaacttcctact 600 ccagccaagg gtactggttt ggttgttcat gatgctggtg cttattgcatgagcatggca 660 tcaacctaca atctaaagat gcggcctcct gagtattggg ttgaagatgatggatcagtg 720 agcaaaataa gacatggaga gacttttgaa gaccacattc ggttttttgaggggctttga 780 gctaataatt tatcttgtag gaaagaaggc tggagaattg ttatgtacttggagtttgaa 840 tctttcctcg tcaatgaatg catgactctt gtagttctgt ttcttccgttctaattgaat 900 gttgactccc atgacaggaa cagagaataa agttgatttc agttagatttaaaaaaaaaa 960 aaaaaaaa 968 15 259 PRT Glycine max 15 Val Ala Thr GlyAsn Lys Asn Ser Lys Phe Gly Ile Arg Asn Glu Lys 1 5 10 15 Leu Gln CysPhe Leu Asp Ala Val Lys Glu His Pro Asn Glu Leu Lys 20 25 30 Leu Val GlyAla His Cys His Leu Gly Ser Thr Ile Thr Lys Val Asp 35 40 45 Ile Phe ArgAsp Ala Ala Thr Ile Met Ile Asn Tyr Ile Asp Gln Ile 50 55 60 Arg Asp GlnGly Phe Glu Val Asp Tyr Leu Asn Ile Gly Gly Gly Leu 65 70 75 80 Gly IleAsp Tyr Tyr His Ser Gly Ala Ile Leu Pro Thr Pro Arg Asp 85 90 95 Leu IleAsp Thr Val Arg Asp Leu Val Ile Ser Arg Gly Leu Asn Leu 100 105 110 IleIle Glu Pro Gly Arg Ser Leu Ile Ala Asn Thr Cys Cys Leu Val 115 120 125Asn Arg Val Thr Gly Val Lys Thr Asn Gly Ser Lys Asn Phe Ile Val 130 135140 Ile Asp Gly Ser Met Ala Glu Leu Ile Arg Pro Ser Leu Tyr Asp Ala 145150 155 160 Tyr Gln His Ile Glu Leu Val Ser Pro Ala Pro Ser Asn Ala GluThr 165 170 175 Glu Thr Phe Asp Val Val Gly Pro Val Cys Glu Ser Ala AspPhe Leu 180 185 190 Gly Lys Gly Arg Glu Leu Pro Thr Pro Ala Lys Gly ThrGly Leu Val 195 200 205 Val His Asp Ala Gly Ala Tyr Cys Met Ser Met AlaSer Thr Tyr Asn 210 215 220 Leu Lys Met Arg Pro Pro Glu Tyr Trp Val GluAsp Asp Gly Ser Val 225 230 235 240 Ser Lys Ile 
Arg His Gly Glu Thr PheGlu Asp His Ile Arg Phe Phe 245 250 255 Glu Gly Leu 16 676 DNA Triticumaestivum unsure (373) n = A, C, G or T 16 tttgagttgg agtacctgaatattggaggt ggtttgggga tagactacca ccacactggt 60 gcagtcttgc ctacacctatggatcttatc aacactgtcc gggaattggt cctctcacgg 120 gatcttactc tcattattgaacctggaaga tccctgatcg ccaatacttg ctgcttcgtc 180 aataaggtca ctggtgtaaaatcgaatggc acgaagaatt tcattgtagt tgatggcagc 240 atggccgagc tcatcaggcctagtctatat ggagcatatc agcatataga actagttctc 300 cctctccaag gtgcagaagtagcaaccttc cgatattgtt ggggccagtc tgcgaatctg 360 cagattcctt ggnaaagacaaggagttcca acacctgaca aggganctgg tttgggtgtc 420 cacgacgcan ganctactgcatgagcatgg cttcnaccta caacctgaag atgaggcaac 480 cgagtattgg gtanaggacatggnccatgt aagataagca cggggaaaca ttgacgacac 540 atgagtcttg atngctccgccaggccttta ctggttggna acnagcttca ttgtnnccac 600 cgtggaatct gggaacatcntgttgtagtg gcaccacana gggnttttgn gacaatcaca 660 ntagatgaga ttntgg 676 1773 PRT Triticum aestivum 17 Pro Thr Pro Met Asp Leu Ile Asn Thr Val ArgGlu Leu Val Leu Ser 1 5 10 15 Arg Asp Leu Thr Leu Ile Ile Glu Pro GlyArg Ser Leu Ile Ala Asn 20 25 30 Thr Cys Cys Phe Val Asn Lys Val Thr GlyVal Lys Ser Asn Gly Thr 35 40 45 Lys Asn Phe Ile Val Val Asp Gly Ser MetAla Glu Leu Ile Arg Pro 50 55 60 Ser Leu Tyr Gly Ala Tyr Gln His Ile 6570 18 544 DNA Glycine max unsure (465) n = A, C, G or T 18 ttgcaacacacattgtcttg tcggcaaaat cttccaccaa caacacacag ccatggcagg 60 ctcaaacattctttctcact ctccttccct tcccaaaacc tacagccact ccttaaacca 120 aaacgcgttatcccaaaagc ttttttttct gcccctcaaa ttcaaagcca ccacaaaacc 180 acgtgctctcagagcggttc tctcgcagaa cgctgtcaaa acctcggtgg aggacacaaa 240 gaacgctcattttcagcact gtttcaccaa atccgaagat gggtatctgt actgtgaggg 300 cctcaaggtgcatgacatca tggaatctgt tgagagaaga cctttctatt tgtacagcaa 360 gccccagataactaggaatg ttgaagccta caaggatgca ttggaagggt tgaactccat 420 aattggttatgccattaagg ccaataataa cttgaagatt ttggnacatt tgaggcactt 480 gggttgtggtgctgtgcttg ttagtgggaa tgagctgaag ttgntcttcg agctggnttt 540 gttc 544 1962 PRT Glycine max UNSURE (44) Xaa = 
ANY AMINO ACID 19 Arg Arg Pro PheTyr Leu Tyr Ser Lys Pro Gln Ile Thr Arg Asn Val 1 5 10 15 Glu Ala TyrLys Asp Ala Leu Glu Gly Leu Asn Ser Ile Ile Gly Tyr 20 25 30 Ala Ile LysAla Asn Asn Asn Leu Lys Ile Leu Xaa His Leu Arg His 35 40 45 Leu Gly CysGly Ala Val Leu Val Ser Gly Asn Glu Leu Lys 50 55 60 20 371 PRTPseudomonas aeruginosa 20 Met Lys Arg Val Gly Leu Ile Gly Trp Arg GlyMet Val Gly Ser Val 1 5 10 15 Leu Ile Gln Arg Met Leu Glu Glu Arg AspPhe Asp Leu Ile Glu Pro 20 25 30 Val Phe Phe Thr Thr Ser Asn Val Gly AlaGln Ala Pro Glu Val Asp 35 40 45 Lys Asp Ile Ala Pro Leu Lys Asp Ala TyrSer Ile Asp Glu Leu Lys 50 55 60 Thr Leu Asp Val Ile Leu Thr Cys Gln GlyGly Asp Tyr Thr Ser Glu 65 70 75 80 Val Phe Pro Lys Leu Arg Glu Ala GlyTrp Gln Gly Tyr Trp Ile Asp 85 90 95 Ala Ala Ser Ser Leu Arg Met Glu AspAsp Ala Val Ile Val Leu Asp 100 105 110 Pro Val Asn Arg Lys Val Ile AspGln Ala Leu Asp Ala Gly Thr Arg 115 120 125 Asn Tyr Ile Gly Gly Asn CysThr Val Ser Leu Met Leu Met Ala Leu 130 135 140 Gly Gly Leu Phe Asp AlaGly Leu Val Glu Trp Met Ser Ala Met Thr 145 150 155 160 Tyr Gln Ala AlaSer Gly Ala Gly Ala Gln Asn Met Arg Asp Leu Leu 165 170 175 Lys Gln MetGly Ala Ala His Ala Ser Val Ala Asp Asp Leu Ala Asn 180 185 190 Pro AlaSer Ala Ile Leu Asp Ile Asp Arg Lys Val Ala Glu Thr Leu 195 200 205 ArgSer Glu Ala Phe Pro Thr Glu His Phe Gly Ala Pro Leu Gly Gly 210 215 220Ser Leu Ile Pro Trp Ile Asp Lys Glu Leu Ser Gln Arg Arg Gln Ser 225 230235 240 Arg Glu Glu Trp Lys Ala Gln Ala Glu Thr Asn Lys Ile Leu Ala Arg245 250 255 Phe Lys Asn Pro Ile Pro Val Asp Gly Ile Cys Val Arg Val GlyAla 260 265 270 Met Arg Cys His Ser Gln Ala Leu Thr Ile Lys Leu Asn LysAsp Val 275 280 285 Pro Leu Thr Asp Ile Glu Gly Leu Ile Arg Gln His AsnPro Trp Val 290 295 300 Lys Leu Val Pro Asn His Arg Glu Val Ser Val ArgGlu Leu Thr Pro 305 310 315 320 Ala Ala Val Thr Gly Thr Leu Ser Val ProVal Gly Arg Leu Arg Lys 325 330 335 Leu Asn Met Val Ser Gln Tyr Leu GlyAla Phe Thr Val Gly Asp Gln 340 345 350 Leu Leu Trp Gly 
Ala Ala Glu ProLeu Arg Arg Met Leu Arg Ile Leu 355 360 365 Leu Glu Arg 370 21 788 DNAZea mays 21 cgacaacatc gcccccgcca tcctcggcgg cttcgtcctc gtccgcagctacgacccctt 60 tcacctcgtc ccgctttcct tcccgccagc gctccgcctc cacttcgtcctggtcacccc 120 cgacttcgag gcgcccacga gcaagatgcg cgccgcgctg cccaggcaggtcgacgtcca 180 gcagcacgtg cgcaactcca gccaggcagc ggcgctcgtg gcggcggtgctgcaggggga 240 cgcgggcctc atcggctccg cgatgtcgtc cgacggcatc gtggagcccaccagggcacc 300 cctcatacct ggcatggcgg ccgtaaaggc ggcggccctg caagctggagcgctgggctg 360 cacaattagc ggcgcgggcc ccacagtggt ggccgtcatc caaggggaggaaagggggga 420 ggaggttgcc cgcaagatgg tggacgcgtt ctggagcgca ggcaagctcaaggcgacagc 480 aaccgtcgcg cagctcgata cccttggtgc cagggtcatc gccacgtcatccttgaacta 540 gcaaaagatt cggaaagtgg tactgcaatt gtatcaccaa acaaggaagaatgaagggga 600 accccatgga tttgtatgtt ttctcttctt tcttgcatct ttaggtggttaattggcttt 660 ggaataaatg agatggagga catcgctaga acaattctgt tccgtgggctgtaatttcaa 720 tttgggctgg tttctttatc atgccatgga taattatgaa taaatttgaggtagtttgtt 780 aaaaaaaa 788 22 179 PRT Zea mays 22 Asp Asn Ile Ala ProAla Ile Leu Gly Gly Phe Val Leu Val Arg Ser 1 5 10 15 Tyr Asp Pro PheHis Leu Val Pro Leu Ser Phe Pro Pro Ala Leu Arg 20 25 30 Leu His Phe ValLeu Val Thr Pro Asp Phe Glu Ala Pro Thr Ser Lys 35 40 45 Met Arg Ala AlaLeu Pro Arg Gln Val Asp Val Gln Gln His Val Arg 50 55 60 Asn Ser Ser GlnAla Ala Ala Leu Val Ala Ala Val Leu Gln Gly Asp 65 70 75 80 Ala Gly LeuIle Gly Ser Ala Met Ser Ser Asp Gly Ile Val Glu Pro 85 90 95 Thr Arg AlaPro Leu Ile Pro Gly Met Ala Ala Val Lys Ala Ala Ala 100 105 110 Leu GlnAla Gly Ala Leu Gly Cys Thr Ile Ser Gly Ala Gly Pro Thr 115 120 125 ValVal Ala Val Ile Gln Gly Glu Glu Arg Gly Glu Glu Val Ala Arg 130 135 140Lys Met Val Asp Ala Phe Trp Ser Ala Gly Lys Leu Lys Ala Thr Ala 145 150155 160 Thr Val Ala Gln Leu Asp Thr Leu Gly Ala Arg Val Ile Ala Thr Ser165 170 175 Ser Leu Asn 23 601 DNA Oryza sativa unsure (433) n = A, C, Gor T 23 gtcgccgcca tcgctgccct tcgcgccctc gatgtcaagt cccacgccgtctccatccac 60 ctcaccaagg 
gcctccccct cggctccggc ctcggctcct ccgccgcctccgccgccgcc 120 gctgccaagg ccgttgacgc cctcttcggc tccctcctac accaagatgacctcgtcctc 180 gcgggcctcg agtccgagaa agccgtcagt ggcttccacg ccgacaacatcgccccggcc 240 atcctcggcg gcttcgtcct cgtccgcagc tacgacccct tccacctcatcccgctctcc 300 tccccacctg ccctccgcct ccacttcgtc ctcgtcacgc ccgacttcgaggcgcccacc 360 aagcaagatg cgtgccgcgc tgcccaaaca ggtggccgtc caccaagcacgtccgcaact 420 ccagccaagc ggncgcgctt gtcgccgctg tgctgcaagg ggacgccaccctcatcggct 480 ccgcaatgtc ctccgacggc atcgtggagc caacaaggcg ccgctgattctggatggctg 540 cggtcaaagg cgccggcttg gaactggggg aattggctgc acatcagtggagaaggcaan 600 t 601 24 82 PRT Oryza sativa UNSURE (56) (57) Xaa = ANYAMINO ACID 24 Val Ser Ile His Leu Thr Lys Gly Leu Pro Leu Gly Ser GlyLeu Gly 1 5 10 15 Ser Ser Ala Ala Ser Ala Ala Ala Ala Ala Lys Ala ValAsp Ala Leu 20 25 30 Phe Gly Ser Leu Leu His Gln Asp Asp Leu Val Leu AlaGly Leu Glu 35 40 45 Ser Glu Lys Ala Val Ser Gly Xaa Xaa His Ala Asp AsnIle Ala Pro 50 55 60 Ala Ile Leu Gly Gly Phe Val Leu Val Arg Ser Tyr AspPro Phe His 65 70 75 80 Leu Ile 25 1543 DNA Glycine max 25 gaagagagacaaaccagcaa gagtggagat ggcgacgtcg acgtgcttcc tgtgtccgtc 60 tacggcgagtttgaaaggca gggccagatt cagaatcaga atcagatgca gcagcagcgt 120 gtcggtcaatattcgaaggg agcccgaacc tgtaacgacg ctggtgaaag cgtttgctcc 180 cgccacggtggcgaatctag gtccaggctt cgacttccta ggctgcgccg tggacggact 240 cggagacattgtgtcggtga aggttgaccc acaggttcac cctggcgaga tatgcatatc 300 cgacatcagcggccacgccc caaacaagct cagcaaaaac cctctctgga actgcgccgg 360 catcgccgccattgaagtca tgaaaatgct ctccattcga tccgtcggcc tctccctctc 420 cctggagaagggcctgcctt tgggaagcgg tctgggatcc agcgccgcca gcgccgccgc 480 ggccgccgtggcggtgaacg agctgtttgg gaagaaatta agcgtggagg agctggttct 540 ggcatcactgaaatcggaag agaaggtgtc ggggtatcac gcggacaacg tggcgccatc 600 gataatggggggttttgtgc tgatcgggag ctactcgccg ctggagttga tgccgttgaa 660 gtttccggcagagaaggagc tgtatttcgt gctggtgacg cctgagttcg aggccccgac 720 gaagaagatgcgggcagcgc tgcctacgga gatcgggatg ccgcaccacg tgtggaactg 780 cagccaggcaggtgctctgg 
tggcgtcggt gctgcagggc gacgtggttg ggttggggaa 840 ggcattgtcctctgacaaga tcgttgagcc aaggcgtgcc cccttgattc ctggcatgga 900 ggctgtcaagagggctgcca ttcaggccgg tgcttttggc tgtaccatca gcggcgccgg 960 ccctaccgccgtcgccgtca ttgacgacga gcaaactgga cacctcattg ccaaacacat 1020 gattgacgcttttctccatg ttggcaattt gaaggcttct gcaaatgtca agcagcttga 1080 tcgccttggtgctagacgca ttccaaattg aaccttctct tctctatctc tatgagaggc 1140 ttgtagatttcaagaaccgg atttcttcca acttgctcgt aacactctaa gtgctgaccg 1200 gtcacatgtatttgaaattt gatctgatca atgaagcagc attctagtgt ggaggtctga 1260 ataacaagagaaacattaaa cccaagctgg gagctctgtt tgggtggtgg aaatttaaat 1320 agatgaataattatgaaaga cctagatcag gtcagtgtta tggtgaactc tgaagcatgt 1380 tttagattttctttgctttg tttttatcat atttttatct tgctacttga gttgacaaag 1440 ctcaaaaagaagtcattttt agtattttct tgtttcatta tgctagttaa tcttagcttt 1500 tgaatagcatgtattgttcc ttaaaaaaaa aaaaaaaaaa aaa 1543 26 483 PRT Glycine max 26 MetAla Thr Ser Thr Cys Phe Leu Cys Pro Ser Thr Ala Ser Leu Lys 1 5 10 15Gly Arg Ala Arg Phe Arg Ile Arg Ile Arg Cys Ser Ser Ser Val Ser 20 25 30Val Asn Ile Arg Arg Glu Pro Glu Pro Val Thr Thr Leu Val Lys Ala 35 40 45Phe Ala Pro Ala Thr Val Ala Asn Leu Gly Pro Gly Phe Asp Phe Leu 50 55 60Gly Cys Ala Val Asp Gly Leu Gly Asp Ile Val Ser Val Lys Val Asp 65 70 7580 Pro Gln Val His Pro Gly Glu Ile Cys Ile Ser Asp Ile Ser Gly His 85 9095 Ala Pro Asn Lys Leu Ser Lys Asn Pro Leu Trp Asn Cys Ala Gly Ile 100105 110 Ala Ala Ile Glu Val Met Lys Met Leu Ser Ile Arg Ser Val Gly Leu115 120 125 Ser Leu Ser Leu Glu Lys Gly Leu Pro Leu Gly Ser Gly Leu GlySer 130 135 140 Ser Ala Ala Ser Ala Ala Ala Ala Ala Val Ala Val Asn GluLeu Phe 145 150 155 160 Gly Lys Lys Leu Ser Val Glu Glu Leu Val Leu AlaSer Leu Lys Ser 165 170 175 Glu Glu Lys Val Ser Gly Tyr His Ala Asp AsnVal Ala Pro Ser Ile 180 185 190 Met Gly Gly Phe Val Leu Ile Gly Ser TyrSer Pro Leu Glu Leu Met 195 200 205 Pro Leu Lys Phe Pro Ala Glu Lys GluLeu Tyr Phe Val Leu Val Thr 210 215 220 Pro Glu Phe Glu Ala Pro Thr LysLys Met Arg Ala Ala Leu Pro Thr 225 
230 235 240 Glu Ile Gly Met Pro HisHis Val Trp Asn Cys Ser Gln Ala Gly Ala 245 250 255 Leu Val Ala Ser ValLeu Gln Gly Asp Val Val Gly Leu Gly Lys Ala 260 265 270 Leu Ser Ser AspLys Ile Val Glu Pro Arg Arg Ala Pro Leu Ile Pro 275 280 285 Gly Met GluAla Val Lys Arg Ala Ala Ile Gln Ala Gly Ala Phe Gly 290 295 300 Cys ThrIle Ser Gly Ala Gly Pro Thr Ala Val Ala Val Ile Asp Asp 305 310 315 320Glu Gln Thr Gly His Leu Ile Ala Lys His Met Ile Asp Ala Phe Leu 325 330335 His Val Gly Asn Leu Lys Ala Ser Ala Asn Val Lys Gln Leu Asp Arg 340345 350 Leu Gly Ala Arg Arg Ile Pro Asn Thr Phe Ser Ser Leu Ser Leu Glu355 360 365 Ala Cys Arg Phe Gln Glu Pro Asp Phe Phe Gln Leu Ala Arg AsnThr 370 375 380 Leu Ser Ala Asp Arg Ser His Val Phe Glu Ile Ser Asp GlnSer Ser 385 390 395 400 Ile Leu Val Trp Arg Ser Glu Gln Glu Lys His ThrGln Ala Gly Ser 405 410 415 Ser Val Trp Val Val Glu Ile Ile Asp Glu LeuLys Thr Ile Arg Ser 420 425 430 Val Leu Trp Thr Leu Lys His Val Leu AspPhe Leu Cys Phe Val Phe 435 440 445 Ile Ile Phe Leu Ser Cys Tyr Leu SerGln Ser Ser Lys Arg Ser His 450 455 460 Phe Tyr Phe Leu Val Ser Leu CysLeu Ile Leu Ala Phe Glu His Val 465 470 475 480 Leu Phe Leu 27 438 DNATriticum aestivum unsure (271) n = A, C, G or T 27 ctcgagtcgg agaaggccgtcagcggcttc cacgccgaca acatcgcccc cgccatcctc 60 ggcggcttcg tcctcgtccgcagctacgac ccctttcacc tcgtcccgct ttccttcccg 120 ccagcgctcc gcctccacttcgtcctggtc acccccgact tcgaggcgcc cacgagcaag 180 atgcgcgccg cgctgcccaggcaggtcgac gtccagcagc acgtgcgcaa ctccagccag 240 gcagcggcgc tccgtggcggcggtgctgca nggggacgcc gggctcatcg gtccgcgatt 300 tctccgacgg gcatcgtggacccaccaagg aaccctcata cctggcatgg cggccgtaaa 360 ggcggcggcc tgcaactggacgctgggtgc acattaacgg gcgggcccac atggtggctc 420 ncagngaaga gaggggag 43828 84 PRT Triticum aestivum 28 Leu Glu Ser Glu Lys Ala Val Ser Gly PheHis Ala Asp Asn Ile Ala 1 5 10 15 Pro Ala Ile Leu Gly Gly Phe Val LeuVal Arg Ser Tyr Asp Pro Phe 20 25 30 His Leu Val Pro Leu Ser Phe Pro ProAla Leu Arg Leu His Phe Val 35 40 45 Leu Val Thr Pro Asp Phe 
Glu Ala ProThr Ser Lys Met Arg Ala Ala 50 55 60 Leu Pro Arg Gln Val Asp Val Gln GlnHis Val Arg Asn Ser Ser Gln 65 70 75 80 Ala Ala Ala Leu 29 300 PRTMethanococcus jannashii 29 Met Arg Glu Ile Met Lys Val Arg Val Lys AlaPro Cys Thr Ser Ala 1 5 10 15 Asn Leu Gly Val Gly Phe Asp Val Phe GlyLeu Cys Leu Lys Glu Pro 20 25 30 Tyr Asp Val Ile Glu Val Glu Ala Ile AspAsp Lys Glu Ile Ile Ile 35 40 45 Glu Val Asp Asp Lys Asn Ile Pro Thr AspPro Asp Lys Asn Val Ala 50 55 60 Gly Ile Val Ala Lys Lys Met Ile Asp AspPhe Asn Ile Gly Lys Gly 65 70 75 80 Val Lys Ile Thr Ile Lys Lys Gly ValLys Ala Gly Ser Gly Leu Gly 85 90 95 Ser Ser Ala Ala Ser Ser Ala Gly ThrAla Tyr Ala Ile Asn Glu Leu 100 105 110 Phe Lys Leu Asn Leu Asp Lys LeuLys Leu Val Asp Tyr Ala Ser Tyr 115 120 125 Gly Glu Leu Ala Ser Ser GlyAla Lys His Ala Asp Asn Val Ala Pro 130 135 140 Ala Ile Phe Gly Gly PheThr Met Val Thr Asn Tyr Glu Pro Leu Glu 145 150 155 160 Val Leu His IlePro Ile Asp Phe Lys Leu Asp Ile Leu Ile Ala Ile 165 170 175 Pro Asn IleSer Ile Asn Thr Lys Glu Ala Arg Glu Ile Leu Pro Lys 180 185 190 Ala ValGly Leu Lys Asp Leu Val Asn Asn Val Gly Lys Ala Cys Gly 195 200 205 MetVal Tyr Ala Leu Tyr Asn Lys Asp Lys Ser Leu Phe Gly Arg Tyr 210 215 220Met Met Ser Asp Lys Val Ile Glu Pro Val Arg Gly Lys Leu Ile Pro 225 230235 240 Asn Tyr Phe Lys Ile Lys Glu Glu Val Lys Asp Lys Val Tyr Gly Ile245 250 255 Thr Ile Ser Gly Ser Gly Pro Ser Ile Ile Ala Phe Pro Lys GluGlu 260 265 270 Phe Ile Asp Glu Val Glu Asn Ile Leu Arg Asp Tyr Tyr GluAsn Thr 275 280 285 Ile Arg Thr Glu Val Gly Lys Gly Val Glu Val Val 290295 300 30 1362 DNA Glycine max 30 actttgtagt tcgtagatag ccgatgtgcttgtcttagtg tgtcagtcat tcctgttcct 60 caagtcaagc tttgtagtga gcagatataatggctgttga aaggtccgga attgccaaag 120 atgttacgga attgattggt aaaaccccattagtatatct aaataaactt gcggatggtt 180 gtgttgcccg ggttgctgct aaactggagttgatggagcc atgctctagt gtgaaggaca 240 ggattgggta tagtatgatt gctgatgcagaagagaaggg acttatcaca cctggaaaga 300 gtgtcctcat tgagccaaca agtggtaatactggcattgg 
attagccttc atggcagcag 360 ccaggggtta caagctcata attacaatgcctgcttctat gagtcttgag agaagaatca 420 ttctattagc ttttggagct gagttggttctgacagatcc tgctaaggga atgaaaggtg 480 ctgttcagaa ggctgaagag atattggctaagacgcccaa tgcctacata cttcaacaat 540 ttgaaaaccc tgccaatccc aaggttcattatgaaaccac tggtccagag atatggaaag 600 gctccgatgg gaaaattgat gcatttgtttctgggatagg cactggtggt acaataacag 660 gtgctggaaa atatcttaaa gagcagaatccgaatataaa gctgattggt gtggaaccag 720 ttgaaagtcc agtgctctca ggaggaaagcctggtccaca caagattcaa gggattggtg 780 ctggttttat ccctggtgtc ttggaagtcaatcttcttga tgaagttgtt caaatatcaa 840 gtgatgaagc aatagaaact gcaaagcttcttgcgcttaa agaaggccta tttgtgggaa 900 tatcttccgg agctgcagct gctgctgcttttcagattgc aaaaagacca gaaaatgccg 960 ggaagcttat tgttgccgtt tttcccagcttcggggagag gtacctgtcc tccgtgctat 1020 ttgagtcagt gagacgcgaa gctgaaagcatgacttttga gccctgaatt cccgtttaag 1080 gctctcacta ctgaattttc ttgttacttgtaccaggctt taactagatt gttagagtac 1140 tactgtttgt gactctgact ctaaaataaaacttgctcca aaagactagt ttttcttgat 1200 gcccctggag cgataatttt gtgcctgcaacattaaaaag tattcaaagt tgcttataag 1260 taacatgttt catcttttgt tgttgttgagacgaacacgg atgaggtcat aatactatgt 1320 ttctgatttc ctttggtagg gaaaaaaaaaaaaaaaaaaa aa 1362 31 325 PRT Glycine max 31 Met Ala Val Glu Arg Ser GlyIle Ala Lys Asp Val Thr Glu Leu Ile 1 5 10 15 Gly Lys Thr Pro Leu ValTyr Leu Asn Lys Leu Ala Asp Gly Cys Val 20 25 30 Ala Arg Val Ala Ala LysLeu Glu Leu Met Glu Pro Cys Ser Ser Val 35 40 45 Lys Asp Arg Ile Gly TyrSer Met Ile Ala Asp Ala Glu Glu Lys Gly 50 55 60 Leu Ile Thr Pro Gly LysSer Val Leu Ile Glu Pro Thr Ser Gly Asn 65 70 75 80 Thr Gly Ile Gly LeuAla Phe Met Ala Ala Ala Arg Gly Tyr Lys Leu 85 90 95 Ile Ile Thr Met ProAla Ser Met Ser Leu Glu Arg Arg Ile Ile Leu 100 105 110 Leu Ala Phe GlyAla Glu Leu Val Leu Thr Asp Pro Ala Lys Gly Met 115 120 125 Lys Gly AlaVal Gln Lys Ala Glu Glu Ile Leu Ala Lys Thr Pro Asn 130 135 140 Ala TyrIle Leu Gln Gln Phe Glu Asn Pro Ala Asn Pro Lys Val His 145 150 155 160Tyr Glu Thr Thr Gly Pro Glu Ile Trp Lys Gly Ser 
Asp Gly Lys Ile 165 170175 Asp Ala Phe Val Ser Gly Ile Gly Thr Gly Gly Thr Ile Thr Gly Ala 180185 190 Gly Lys Tyr Leu Lys Glu Gln Asn Pro Asn Ile Lys Leu Ile Gly Val195 200 205 Glu Pro Val Glu Ser Pro Val Leu Ser Gly Gly Lys Pro Gly ProHis 210 215 220 Lys Ile Gln Gly Ile Gly Ala Gly Phe Ile Pro Gly Val LeuGlu Val 225 230 235 240 Asn Leu Leu Asp Glu Val Val Gln Ile Ser Ser AspGlu Ala Ile Glu 245 250 255 Thr Ala Lys Leu Leu Ala Leu Lys Glu Gly LeuPhe Val Gly Ile Ser 260 265 270 Ser Gly Ala Ala Ala Ala Ala Ala Phe GlnIle Ala Lys Arg Pro Glu 275 280 285 Asn Ala Gly Lys Leu Ile Val Ala ValPhe Pro Ser Phe Gly Glu Arg 290 295 300 Tyr Leu Ser Ser Val Leu Phe GluSer Val Arg Arg Glu Ala Glu Ser 305 310 315 320 Met Thr Phe Glu Pro 32532 325 PRT Citrullus lanatus 32 Met Ala Asp Ala Lys Ser Thr Ile Ala LysAsp Val Thr Glu Leu Ile 1 5 10 15 Gly Asn Thr Pro Leu Val Tyr Leu AsnArg Val Val Asp Gly Cys Val 20 25 30 Ala Arg Val Ala Ala Lys Leu Glu MetMet Glu Pro Cys Ser Ser Val 35 40 45 Lys Asp Arg Ile Gly Tyr Ser Met IleSer Asp Ala Glu Asn Lys Gly 50 55 60 Leu Ile Thr Pro Gly Glu Ser Val LeuIle Glu Pro Thr Ser Gly Asn 65 70 75 80 Thr Gly Ile Gly Leu Ala Phe IleAla Ala Ala Lys Gly Tyr Arg Leu 85 90 95 Ile Ile Cys Met Pro Ala Ser MetSer Leu Glu Arg Arg Thr Ile Leu 100 105 110 Arg Ala Phe Gly Ala Glu LeuVal Leu Thr Asp Pro Ala Arg Gly Met 115 120 125 Lys Gly Ala Val Gln LysAla Glu Glu Ile Lys Ala Lys Thr Pro Asn 130 135 140 Ser Tyr Ile Leu GlnGln Phe Glu Asn Pro Ala Asn Pro Lys Ile His 145 150 155 160 Tyr Glu ThrThr Gly Pro Glu Ile Trp Arg Gly Ser Gly Gly Lys Ile 165 170 175 Asp AlaLeu Val Ser Gly Ile Gly Thr Gly Gly Thr Val Thr Gly Ala 180 185 190 GlyLys Tyr Leu Lys Glu Gln Asn Pro Asn Ile Lys Leu Tyr Gly Val 195 200 205Glu Pro Val Glu Ser Ala Ile Leu Ser Gly Gly Lys Pro Gly Pro His 210 215220 Lys Ile Gln Gly Ile Gly Ala Gly Phe Ile Pro Gly Val Leu Asp Val 225230 235 240 Asn Leu Leu Asp Glu Val Ile Gln Val Ser Ser Glu Glu Ser IleGlu 245 250 255 Thr Ala Lys Leu Leu Ala Leu Lys Glu Gly 
Leu Leu Val GlyIle Ser 260 265 270 Ser Gly Ala Ala Ala Ala Ala Ala Ile Arg Ile Ala LysArg Pro Glu 275 280 285 Asn Ala Gly Lys Leu Ile Val Ala Val Phe Pro SerPhe Gly Glu Arg 290 295 300 Tyr Leu Ser Thr Val Leu Phe Glu Ser Val LysArg Glu Thr Glu Asn 305 310 315 320 Met Val Phe Glu Pro 325 33 789 DNAZea mays 33 atagcgcatt ctcatggtgc tcttgttttg gttgacaaca gcatcatgtctccagtgctc 60 tcccgtccta tagaactggg agctgatatc gtgatgcact cggctaccaaatttatagcg 120 ggacatagtg atcttatggc tggaattctt gcagtgaagg gtgagagtttggctaaagag 180 gtagggtttc tgcaaaatgc tgaagggtcg ggtctggcac cttttgactgctggctttgc 240 ttgaggggaa tcaaaaccat ggctctgcgg gtggagaaac aacaggctaatgcccagaag 300 attgctgaat tcctggcgtc tcacccgagg gtcaagcaag taaactacgctgggcttcct 360 gaccatcctg ggcgagcttt acactattcc caggcaaagg gagcgggctctgttctcagt 420 tttctcaccg gctcactggc cctctcaaag cacgtcgtgg agaccaccaagtacttcagc 480 gtaacagtca gcttcgggag cgtgaagtcc ctcatcagcc tgccgtgcttcatgtcccac 540 gcatcaatcc ctgcctcggt ccgcgaggag cgtggcctaa ccgacgacctcgtccggata 600 tcggtcggca tcgaggatgt cgaggacctc atcgccgatc tggaccgcgcgctcagaact 660 ggcccggtgt agacatcgcc gatccttagg tcatgtcaag ctatcttttgatgattcatt 720 ggttgactgc ttgcgtgatg ataataatgg gaatgttgct tggataaaaaaaaaaaaaaa 780 aaaactcga 789 34 223 PRT Zea mays 34 Ile Ala His Ser HisGly Ala Leu Val Leu Val Asp Asn Ser Ile Met 1 5 10 15 Ser Pro Val LeuSer Arg Pro Ile Glu Leu Gly Ala Asp Ile Val Met 20 25 30 His Ser Ala ThrLys Phe Ile Ala Gly His Ser Asp Leu Met Ala Gly 35 40 45 Ile Leu Ala ValLys Gly Glu Ser Leu Ala Lys Glu Val Gly Phe Leu 50 55 60 Gln Asn Ala GluGly Ser Gly Leu Ala Pro Phe Asp Cys Trp Leu Cys 65 70 75 80 Leu Arg GlyIle Lys Thr Met Ala Leu Arg Val Glu Lys Gln Gln Ala 85 90 95 Asn Ala GlnLys Ile Ala Glu Phe Leu Ala Ser His Pro Arg Val Lys 100 105 110 Gln ValAsn Tyr Ala Gly Leu Pro Asp His Pro Gly Arg Ala Leu His 115 120 125 TyrSer Gln Ala Lys Gly Ala Gly Ser Val Leu Ser Phe Leu Thr Gly 130 135 140Ser Leu Ala Leu Ser Lys His Val Val Glu Thr Thr Lys Tyr Phe Ser 145 150155 160 Val Thr Val Ser 
Phe Gly Ser Val Lys Ser Leu Ile Ser Leu Pro Cys165 170 175 Phe Met Ser His Ala Ser Ile Pro Ala Ser Val Arg Glu Glu ArgGly 180 185 190 Leu Thr Asp Asp Leu Val Arg Ile Ser Val Gly Ile Glu AspVal Glu 195 200 205 Asp Leu Ile Ala Asp Leu Asp Arg Ala Leu Arg Thr GlyPro Val 210 215 220 35 547 DNA Oryza sativa unsure (260) n = A, C, G orT 35 gccttatggc taagcttgag aaggcggatc aggcattctg cttcaccagt gggatggcag60 cactagctgc agtaacacac ctccttaagt ctggacaaga aatagttgct ggagaggaca 120tatatggtgg ctcagaccgt ctgctctcac aagttgcccc gagacatggg attgtagtaa 180aacgaattga tacaaccaaa attagtgagg taacttctgc aattggggcc ttggactaaa 240ctaagtatgg ctttgaaaan cccaccatcc ccgtcctaca aattactgga tataaagaaa 300atagcnagag atagtcatta caatggggct ccttgtttta agtagacaac agcacatgtc 360tccctgtgct ctcccngtcc tcntaaaact ttgggccaaa tatnggtttg caccccaagc 420aaccaattta tnctgggcat agcgtnctta tggcnnggat ccttgccggg aaggggtgaa 480agcacttggc taaagagatg cattcctcna aaanctgaag gntaagtttg gacattngat 540gccggtt 547 36 75 PRT Oryza sativa 36 Leu Met Ala Lys Leu Glu Lys AlaAsp Gln Ala Phe Cys Phe Thr Ser 1 5 10 15 Gly Met Ala Ala Leu Ala AlaVal Thr His Leu Leu Lys Ser Gly Gln 20 25 30 Glu Ile Val Ala Gly Glu AspIle Tyr Gly Gly Ser Asp Arg Leu Leu 35 40 45 Ser Gln Val Ala Pro Arg HisGly Ile Val Val Lys Arg Ile Asp Thr 50 55 60 Thr Lys Ile Ser Glu Val ThrSer Ala Ile Gly 65 70 75 37 1733 DNA Glycine max 37 caaagacggcattgaagttg aacaatccat cactaacaca agcgcagaca acaacataac 60 cctgctccaaacacatcaat ttcaataatg ttttcttctg caatttctca gaagcccttc 120 cttcagtccctcgtcattga tcgttacgct cagagcacaa ctgctgcaac caggtgggag 180 tgcttggggtttaacaagtc agaaaatttc agtaccaaga gagtgttgcg tgcagagggg 240 ttcaagttgaattgcttggt tgaaaataga gagatggaag tggagtcatc atcatcatct 300 ttggtggatgatgctgccat gagcttaagt gaagaggatt taggggagcc tagtatttca 360 acaatggtgatgaatttcga gagtaagttt gatccttttg gagcaattag taccccgctt 420 taccaaacggctacttttaa gcagccttct gcaatagaaa atggtcccta tgactatacc 480 agaagtggaaatcctactcg tgatgcttta gaaagtttac tagcaaagct tgataaagca 540 
gatagagccctgtgcttcac cagtggaatg gctgctttga gtgctgttgt tcgtcttgtt 600 ggaactggtgaggaaattgt caccggagat gatgtatatg gtggctcaga taggttgctg 660 tctcaagtagttccaaggac tggaattgtg gtgaaacggg taaatacatg tgatctagat 720 gaggttgctgctgccattgg actcaggact aagcttgtgt ggcttgagag tccaaccaat 780 cctcggcttcaaatttctga tattcgaaaa atatcagaga tggctcattc acatggtgct 840 cttgtgttagtggacaatag tataatgtca cctgtgttgt ctcagccatt ggaacttgga 900 gcagatattgtcatgcactc agctacaaaa tttattgctg gacatagtga cattatggct 960 ggtgtgcttgctgtgaaggg tgaaaagttg ggaaaggaaa tgtatttctt gcaaaatgca 1020 gagggttcaggcttagcacc atttgactgt tggctttgtt tgcgaggaat caagacaatg 1080 gccctgcgaattgaaaagca acaggataac gcacagaaga ttgcagagtt ccttgcctcc 1140 catcctcgagtgaaggaagt gaattatgct ggcttgcctg gtcatcctgg tcgtgattta 1200 cactattctcaggcaaaggg tgcaggatct gtgcttagct tcttgactgg ttcattggca 1260 ctttcaaagcatattgttga aactaccaaa tacttcagta taaccgtcag ctttgggagt 1320 gtgaagtccctcattagcat gccatgcttt atgtcacatg caagcatacc tgctgcagtt 1380 cgcgaggccagaggtttaac tgaagatctt gtacgaatat ctgtgggaat tgaggatgtg 1440 aatgatctcattgctgatct tggcaatgca cttagaactg gacctcttta atgtcttctc 1500 cacccccccacccaaaaaga aaaaaattca tccttaagaa gttggattag catgttgagg 1560 atttgggagcattgctatcc tgtctttgga ttcttgagag tggaaacttg aagtgttgct 1620 tatgtgcatgtaataaaatc aatatttcct gtaattttgt tgtaacaatt gttatcctta 1680 ccttgcaatatcatgtcata caagttacta ttgaaaaaaa aaaaaaaaaa aaa 1733 38 467 PRT Glycinemax 38 Met Phe Ser Ser Ala Ile Ser Gln Lys Pro Phe Leu Gln Ser Leu Val 15 10 15 Ile Asp Arg Tyr Ala Gln Ser Thr Thr Ala Ala Thr Arg Trp Glu Cys20 25 30 Leu Gly Phe Asn Lys Ser Glu Asn Phe Ser Thr Lys Arg Val Leu Arg35 40 45 Ala Glu Gly Phe Lys Leu Asn Cys Leu Val Glu Asn Arg Glu Met Glu50 55 60 Val Glu Ser Ser Ser Ser Ser Leu Val Asp Asp Ala Ala Met Ser Leu65 70 75 80 Ser Glu Glu Asp Leu Gly Glu Pro Ser Ile Ser Thr Met Val MetAsn 85 90 95 Phe Glu Ser Lys Phe Asp Pro Phe Gly Ala Ile Ser Thr Pro LeuTyr 100 105 110 Gln Thr Ala Thr Phe Lys Gln Pro Ser Ala Ile Glu Asn GlyPro Tyr 115 120 125 
Asp Tyr Thr Arg Ser Gly Asn Pro Thr Arg Asp Ala LeuGlu Ser Leu 130 135 140 Leu Ala Lys Leu Asp Lys Ala Asp Arg Ala Leu CysPhe Thr Ser Gly 145 150 155 160 Met Ala Ala Leu Ser Ala Val Val Arg LeuVal Gly Thr Gly Glu Glu 165 170 175 Ile Val Thr Gly Asp Asp Val Tyr GlyGly Ser Asp Arg Leu Leu Ser 180 185 190 Gln Val Val Pro Arg Thr Gly IleVal Val Lys Arg Val Asn Thr Cys 195 200 205 Asp Leu Asp Glu Val Ala AlaAla Ile Gly Leu Arg Thr Lys Leu Val 210 215 220 Trp Leu Glu Ser Pro ThrAsn Pro Arg Leu Gln Ile Ser Asp Ile Arg 225 230 235 240 Lys Ile Ser GluMet Ala His Ser His Gly Ala Leu Val Leu Val Asp 245 250 255 Asn Ser IleMet Ser Pro Val Leu Ser Gln Pro Leu Glu Leu Gly Ala 260 265 270 Asp IleVal Met His Ser Ala Thr Lys Phe Ile Ala Gly His Ser Asp 275 280 285 IleMet Ala Gly Val Leu Ala Val Lys Gly Glu Lys Leu Gly Lys Glu 290 295 300Met Tyr Phe Leu Gln Asn Ala Glu Gly Ser Gly Leu Ala Pro Phe Asp 305 310315 320 Cys Trp Leu Cys Leu Arg Gly Ile Lys Thr Met Ala Leu Arg Ile Glu325 330 335 Lys Gln Gln Asp Asn Ala Gln Lys Ile Ala Glu Phe Leu Ala SerHis 340 345 350 Pro Arg Val Lys Glu Val Asn Tyr Ala Gly Leu Pro Gly HisPro Gly 355 360 365 Arg Asp Leu His Tyr Ser Gln Ala Lys Gly Ala Gly SerVal Leu Ser 370 375 380 Phe Leu Thr Gly Ser Leu Ala Leu Ser Lys His IleVal Glu Thr Thr 385 390 395 400 Lys Tyr Phe Ser Ile Thr Val Ser Phe GlySer Val Lys Ser Leu Ile 405 410 415 Ser Met Pro Cys Phe Met Ser His AlaSer Ile Pro Ala Ala Val Arg 420 425 430 Glu Ala Arg Gly Leu Thr Glu AspLeu Val Arg Ile Ser Val Gly Ile 435 440 445 Glu Asp Val Asn Asp Leu IleAla Asp Leu Gly Asn Ala Leu Arg Thr 450 455 460 Gly Pro Leu 465 39 637DNA Triticum aestivum unsure (400) n = A, C, G or T 39 agcgtggccacgatactgac cagcttcgag aactcgttcg acaagtatgg ggctctcagc 60 acgccgctgtaccagacggc caccttcaag cagccttcag caaccgttaa tggagcttat 120 gattatactagaagtggcaa ccctactcgt gatgttctcc agagccttat ggctaagctc 180 gagaaggcagaccaagcatt ctgcttcact agtgggatgg catcactggg ctgcagtaac 240 acacctccttcaggctggac aagaaatagt tgctggagag gacatatatg 
gtggtctgat 300 cgtctgctctcacaagttgt cccaagaaat ggaattgtag taaaacgggt cgatacaact 360 aaaattaacgacgtgactgc tgcatcggac ccttgactan actagtttgg ttgaaancca 420 caatcctcgtcaacaattac tgtataagaa atctcaggga tactcatcca tggggactgg 480 tttggnggcaannttcatgt cccanggcta cctggccnat aaantggggn antatgggag 540 catcagtacaaattatnctg gcnatgtcta ggtggatctc ntaaggggaa nttggnagga 600 ttcttcaaaacctagtnggt tgacttatgt ggttgtt 637 40 131 PRT Triticum aestivum UNSURE(77) Xaa = ANY AMINO ACID 40 Ser Val Ala Thr Ile Leu Thr Ser Phe Glu AsnSer Phe Asp Lys Tyr 1 5 10 15 Gly Ala Leu Ser Thr Pro Leu Tyr Gln ThrAla Thr Phe Lys Gln Pro 20 25 30 Ser Ala Thr Val Asn Gly Ala Tyr Asp TyrThr Arg Ser Gly Asn Pro 35 40 45 Thr Arg Asp Val Leu Gln Ser Leu Met AlaLys Leu Glu Lys Ala Asp 50 55 60 Gln Ala Phe Cys Phe Thr Ser Gly Met AlaSer Leu Xaa Ala Val Thr 65 70 75 80 His Leu Leu Gln Ala Gly Gln Glu IleVal Ala Gly Glu Asp Ile Tyr 85 90 95 Gly Gly Xaa Asp Arg Leu Leu Ser GlnVal Val Pro Arg Asn Gly Ile 100 105 110 Val Val Lys Arg Val Asp Thr ThrLys Ile Asn Asp Val Thr Ala Ala 115 120 125 Ser Asp Pro 130 41 464 PRTArabidopsis thaliana 41 Met Thr Ser Ser Leu Ser Leu His Ser Ser Phe ValPro Ser Phe Ala 1 5 10 15 Asp Leu Ser Asp Arg Gly Leu Ile Ser Lys AsnSer Pro Thr Ser Val 20 25 30 Ser Ile Ser Lys Val Pro Thr Trp Glu Lys LysGln Ile Ser Asn Arg 35 40 45 Asn Ser Phe Lys Leu Asn Cys Val Met Glu LysSer Val Asp Gly Gln 50 55 60 Thr His Ser Thr Val Asn Asn Thr Thr Asp SerLeu Asn Thr Met Asn 65 70 75 80 Ile Lys Glu Glu Ala Ser Val Ser Thr LeuLeu Val Asn Leu Asp Asn 85 90 95 Lys Phe Asp Pro Phe Asp Ala Met Ser ThrPro Leu Tyr Gln Thr Ala 100 105 110 Thr Phe Lys Gln Pro Ser Ala Ile GluAsn Gly Pro Tyr Asp Tyr Thr 115 120 125 Arg Ser Gly Asn Pro Thr Arg AspAla Leu Glu Ser Leu Leu Ala Lys 130 135 140 Leu Asp Lys Ala Asp Arg AlaPhe Cys Phe Thr Ser Gly Met Ala Ala 145 150 155 160 Leu Ser Ala Val ThrHis Leu Ile Lys Asn Gly Glu Glu Ile Val Ala 165 170 175 Gly Asp Asp ValTyr Gly Gly Ser Asp Arg Leu Leu Ser Gln Val Val 180 185 190 Pro 
Arg Ser Gly Val Val Val Lys Arg Val Asn Thr Thr Lys Leu Asp 195 200 205 Glu Val Ala Ala Ala Ile Gly Pro Gln Thr Lys Leu Val Trp Leu Glu 210 215 220 Ser Pro Thr Asn Pro Arg Gln Gln Ile Ser Asp Ile Arg Lys Ile Ser 225 230 235 240 Glu Met Ala His Ala Gln Gly Ala Leu Val Leu Val Asp Asn Ser Ile 245 250 255 Met Ser Pro Val Leu Ser Arg Pro Leu Glu Leu Gly Ala Asp Ile Val 260 265 270 Met His Ser Ala Thr Lys Phe Ile Ala Gly His Ser Asp Val Met Ala 275 280 285 Gly Val Leu Ala Val Lys Gly Glu Lys Leu Ala Lys Glu Val Tyr Phe 290 295 300 Leu Gln Asn Ser Glu Gly Ser Gly Leu Ala Pro Phe Asp Cys Trp Leu 305 310 315 320 Cys Leu Arg Gly Ile Lys Thr Met Ala Leu Arg Ile Glu Lys Gln Gln 325 330 335 Glu Asn Ala Arg Lys Ile Ala Met Tyr Leu Ser Ser His Pro Arg Val 340 345 350 Lys Lys Val Tyr Tyr Ala Gly Leu Pro Asp His Pro Gly His His Leu 355 360 365 His Phe Ser Gln Ala Lys Gly Ala Gly Ser Val Phe Ser Phe Ile Thr 370 375 380 Gly Ser Val Ala Leu Ser Lys His Leu Val Glu Thr Thr Lys Tyr Phe 385 390 395 400 Ser Ile Ala Val Ser Phe Gly Ser Val Lys Ser Leu Ile Ser Met Pro 405 410 415 Cys Phe Met Ser His Ala Ser Ile Pro Ala Glu Val Arg Glu Ala Arg 420 425 430 Gly Leu Thr Glu Asp Leu Val Arg Ile Ser Ala Gly Ile Glu Asp Val 435 440 445 Asp Asp Leu Ile Ser Asp Leu Asp Ile Ala Phe Lys Thr Phe Pro Leu 450 455 460 42 1113 DNA Zea mays 42 gccgtccagg acctcgcggc ccctggggcg ttcgacggcg tcgacatcgc gctattcagc 60 gccggcggga gcgtcagccg gaagtatggg cccgcggccg tcgccagcgg cgccgtagtt 120 gtcgacaaca gctccgcgtt ccggatggag cccgaggtgc cgctcgtcat ccccgaggtc 180 aaccccgagg ccatggcgaa cgtccgcctc gggcaggggg cgattgtggc aaatccgaat 240 tgctcgacca tcatctgcct catggctgcc acgccgctcc atcgccacgc taaggtgtta 300 aggatggttg tcagcacata ccaagcagca agtggtgcgg gtgctgcggc aatggaagaa 360 ctcaagctgc agactcagga ggtcttggaa gggaaggcgc caacatgcaa cattttcaaa 420 cagcagtatg cttttaatat attctcacac aatgcaccag ttcttgagaa tgggtataac 480 gaggaggaaa tgaaaatggt gaaggagacc aggaaaattt ggaatgacaa ggaggtgaaa 540 gtaactgcga cttgcatacg ggttcctgtg atgcgcgcac atgctgaaag tgtcaatcta 600
cagtttgaaa agccacttga tgaggatact gcaagagaaa ttttgagagc agctcctggt 660 gttaccatta ttgatgaccg agcttccaat cgctttccta cacctctgga ggtatcagac 720 aaagatgacg tagcagtggg taggattcgt caggacttgt ccctggatgg taaccgaggg 780 ttggacatat ttgtgtgtgg tgatcagata cgtaaaggcg ccgcactcaa tgccgttcag 840 attgctgaaa tgctgctgaa gtgaatgtga cctaaccctc ttgtccctcc ctccctgtcc 900 ctaattgctc tgatcaaatg ctggactgta ctctgattag tttgtcctca attttggtcg 960 cctgttctgt attctgccgt gctagtgcaa taattgtgtt atgggcttga gttatctgct 1020 gtacgcataa gtgggctcct aaactgggaa ataatgggcc gtccttattc agcattccgg 1080 tttatatctt gttcaaaaaa aaaaaaaaaa ata 1113 43 287 PRT Zea mays 43 Ala Val Gln Asp Leu Ala Ala Pro Gly Ala Phe Asp Gly Val Asp Ile 1 5 10 15 Ala Leu Phe Ser Ala Gly Gly Ser Val Ser Arg Lys Tyr Gly Pro Ala 20 25 30 Ala Val Ala Ser Gly Ala Val Val Val Asp Asn Ser Ser Ala Phe Arg 35 40 45 Met Glu Pro Glu Val Pro Leu Val Ile Pro Glu Val Asn Pro Glu Ala 50 55 60 Met Ala Asn Val Arg Leu Gly Gln Gly Ala Ile Val Ala Asn Pro Asn 65 70 75 80 Cys Ser Thr Ile Ile Cys Leu Met Ala Ala Thr Pro Leu His Arg His 85 90 95 Ala Lys Val Leu Arg Met Val Val Ser Thr Tyr Gln Ala Ala Ser Gly 100 105 110 Ala Gly Ala Ala Ala Met Glu Glu Leu Lys Leu Gln Thr Gln Glu Val 115 120 125 Leu Glu Gly Lys Ala Pro Thr Cys Asn Ile Phe Lys Gln Gln Tyr Ala 130 135 140 Phe Asn Ile Phe Ser His Asn Ala Pro Val Leu Glu Asn Gly Tyr Asn 145 150 155 160 Glu Glu Glu Met Lys Met Val Lys Glu Thr Arg Lys Ile Trp Asn Asp 165 170 175 Lys Glu Val Lys Val Thr Ala Thr Cys Ile Arg Val Pro Val Met Arg 180 185 190 Ala His Ala Glu Ser Val Asn Leu Gln Phe Glu Lys Pro Leu Asp Glu 195 200 205 Asp Thr Ala Arg Glu Ile Leu Arg Ala Ala Pro Gly Val Thr Ile Ile 210 215 220 Asp Asp Arg Ala Ser Asn Arg Phe Pro Thr Pro Leu Glu Val Ser Asp 225 230 235 240 Lys Asp Asp Val Ala Val Gly Arg Ile Arg Gln Asp Leu Ser Leu Asp 245 250 255 Gly Asn Arg Gly Leu Asp Ile Phe Val Cys Gly Asp Gln Ile Arg Lys 260 265 270 Gly Ala Ala Leu Asn Ala Val Gln Ile Ala Glu Met Leu Leu Lys 275 280 285 44 1402 DNA Oryza sativa 44 gcccaactcc
caaaacccta gaaccgcgcc gccacaatgc aggccgccgc cgccgccgtc 60 caccgcccgc acctcctcgg cgcctacccc ggcggtggcc gcgcgcgccg cccgtcgtcc 120 accgtgcgga tggcgcttcg ggaggacggg ccgtcggtgg cgatcgtggg cgcgacgggc 180 gccgtcggcc aggagttcct ccgcgtcatc tcctcccggg gcttccccta ccggagcctc 240 cgcctcctcg ccagcgagcg ctccgcgggg aagcgcctcc cgttcgaggg ccaggagtac 300 accgtccagg acctcgccgc gccgggcgcg ttcgacgggg tggacatcgc gctcttcagc 360 gccggcggcg gggtcagccg cgcccacgct cccgcggccg tcgccagcgg cgccgtcgtc 420 gtggacaaca gctccgcctt ccggatggac cccgaggtgc cgctcgtcat ccccgaggtc 480 aatcccgagg ccatggcgca cgtccggctg ggaaaggggg ctattgtggc caacccgaac 540 tgttccacca tcatctgcct catggctgcc acacctctgc accgccacgc caaggtggta 600 aggatggttg tcagcactta ccaagcagca agtggtgctg gggctgcggc catggaagaa 660 ctcaaacttc aaactcaaga ggtcttggcg gggaaagcac caacatgcaa cattttcagt 720 cagcagtatg cttttaatat attttcacat aatgcaccaa ttgttgaaaa tgggtacaat 780 gaggaggaga tgaagatggt gaaggagacc agaaaaatct ggaatgataa agatgtgaag 840 gtaactgcaa cctgcatacg agttcctgtg atgcgtgcac atgctgaaag tgtgaatcta 900 cagtttgaaa agccacttga tgaggatact gcaagggaaa tcttgagggc agctgaaggt 960 gttaccatta ttgatgaccg tgcttccaat cgcttcccca cacctcttga ggtatcggat 1020 aaagatgatg tagcagtggg tagaattcgt caggatttgt cgcaagatga taacaaaggg 1080 ctggacatat ttgtttgtgg agatcaaata cgtaaaggtg ctgcactcaa tgctgtgcag 1140 attgctgaaa tgctactcaa gtgattttct tttctgtacc tttctctcct tgcccctctt 1200 tgctctagtc attgtttgac ggatgtactc tggttagtat gagatcaatt ttgatcatct 1260 tttgtaatct atattcctag tgaaataaat gtaaaacggt tttgctctat cttctgcaca 1320 agtgtagaag aaatctgaaa ttgggaaatt ggagtgtggc ccttgttcaa aaaaaaaaaa 1380 aaaaaaaaaa aaaaaaaaaa aa 1402 45 375 PRT Oryza sativa 45 Met Gln Ala Ala Ala Ala Ala Val His Arg Pro His Leu Leu Gly Ala 1 5 10 15 Tyr Pro Gly Gly Gly Arg Ala Arg Arg Pro Ser Ser Thr Val Arg Met 20 25 30 Ala Leu Arg Glu Asp Gly Pro Ser Val Ala Ile Val Gly Ala Thr Gly 35 40 45 Ala Val Gly Gln Glu Phe Leu Arg Val Ile Ser Ser Arg Gly Phe Pro 50 55 60 Tyr Arg Ser Leu Arg Leu Leu Ala Ser Glu Arg Ser Ala Gly Lys Arg 65 70 75
80 Leu Pro Phe Glu Gly Gln Glu Tyr Thr ValGln Asp Leu Ala Ala Pro 85 90 95 Gly Ala Phe Asp Gly Val Asp Ile Ala LeuPhe Ser Ala Gly Gly Gly 100 105 110 Val Ser Arg Ala His Ala Pro Ala AlaVal Ala Ser Gly Ala Val Val 115 120 125 Val Asp Asn Ser Ser Ala Phe ArgMet Asp Pro Glu Val Pro Leu Val 130 135 140 Ile Pro Glu Val Asn Pro GluAla Met Ala His Val Arg Leu Gly Lys 145 150 155 160 Gly Ala Ile Val AlaAsn Pro Asn Cys Ser Thr Ile Ile Cys Leu Met 165 170 175 Ala Ala Thr ProLeu His Arg His Ala Lys Val Val Arg Met Val Val 180 185 190 Ser Thr TyrGln Ala Ala Ser Gly Ala Gly Ala Ala Ala Met Glu Glu 195 200 205 Leu LysLeu Gln Thr Gln Glu Val Leu Ala Gly Lys Ala Pro Thr Cys 210 215 220 AsnIle Phe Ser Gln Gln Tyr Ala Phe Asn Ile Phe Ser His Asn Ala 225 230 235240 Pro Ile Val Glu Asn Gly Tyr Asn Glu Glu Glu Met Lys Met Val Lys 245250 255 Glu Thr Arg Lys Ile Trp Asn Asp Lys Asp Val Lys Val Thr Ala Thr260 265 270 Cys Ile Arg Val Pro Val Met Arg Ala His Ala Glu Ser Val AsnLeu 275 280 285 Gln Phe Glu Lys Pro Leu Asp Glu Asp Thr Ala Arg Glu IleLeu Arg 290 295 300 Ala Ala Glu Gly Val Thr Ile Ile Asp Asp Arg Ala SerAsn Arg Phe 305 310 315 320 Pro Thr Pro Leu Glu Val Ser Asp Lys Asp AspVal Ala Val Gly Arg 325 330 335 Ile Arg Gln Asp Leu Ser Gln Asp Asp AsnLys Gly Leu Asp Ile Phe 340 345 350 Val Cys Gly Asp Gln Ile Arg Lys GlyAla Ala Leu Asn Ala Val Gln 355 360 365 Ile Ala Glu Met Leu Leu Lys 370375 46 1391 DNA Glycine max 46 gcacgagctt cactctctgt tttgcgccacaaccacctct tctcgggccc cctcccggcc 60 cgccccaagc ccacctcctc ctcctcctccaggatccgaa tgtccctccg cgagaacggc 120 ccctccatcg ccgtcgtggg cgtcaccggcgccgtcggcc aggagttcct ctccgtcctc 180 tccgaccgcg acttccccta ccgctccattcatatgctgg cttccaagcg ctccgctggc 240 cgccgcatca ccttcgagga cagggactacgtcgtccagg agctcacgcc ggagagcttc 300 gacggtgtcg acatcgcgct cttcagcgccggcggctcca tcagcaagca cttcggcccc 360 atcgccgtca atcgtggaac ggtcgtggtcgacaacagct ccgcgtttcg gatgaacgag 420 aaggtgcctt tggtaattcc cgaagtgaaccccgaagcaa tgcaaaacat caaagccgga 480 acgggaaagg gcgcactcat 
tgctaaccctaattgctcca ccattatatg cttgatggct 540 gctacccctc ttcatcgacg tgccaaggtgttacgtatgg ttgttagtac ctatcaggct 600 gcgagtggtg ctggtgctgc tgcaatggaagagcttgagc tgcaaactcg tgaggtgttg 660 gaaggaaaac cacccacttg taaaatatttaaccgacagt atgcttttaa tctattctca 720 cataatgcgt ctgttctttc aaatggatataatgaagaag aaatgaaaat ggtcaaggag 780 accaggaaaa tctggaatga caaggatgttaaagtaactg ccacatgcat acgagttccc 840 atcatgcgag ctcatgctga gagtgtgaatcttcaatttg aaagacccct tgatgaggac 900 actgcaagag atattctgaa aaatgctccaggtgtagtgg ttattgatga tcgtgaatcc 960 aatcattttc ctactccact ggaagtgtcaaacaaggatg atgttgctgt tggtaggatt 1020 cggcaggacc tgtctcagga tgggaatcaagggttggaca tctttgtatg tggggatcaa 1080 attcgcaagg gagctgcact taacgcaatccagattgctg agatgttgct atgagttctg 1140 gtttttcaag gatctggtac ttaaagattatgcttctttt gaaacagttt tgtatgtgct 1200 agttgtatgt ggttattcat ttcttttgtgatgtttaact agtccaagta tcttttcaac 1260 gatgtggtag cacactagct ggaaacagtttttttaaggt cttggtgcgt aatatctgca 1320 atccttttca ccgggaataa caagcactggttatggcaaa aaaaaaaaaa aaaaaaaaaa 1380 aaaaaaaaaa a 1391 47 377 PRTGlycine max 47 Ala Arg Ala Ser Leu Ser Val Leu Arg His Asn His Leu PheSer Gly 1 5 10 15 Pro Leu Pro Ala Arg Pro Lys Pro Thr Ser Ser Ser SerSer Arg Ile 20 25 30 Arg Met Ser Leu Arg Glu Asn Gly Pro Ser Ile Ala ValVal Gly Val 35 40 45 Thr Gly Ala Val Gly Gln Glu Phe Leu Ser Val Leu SerAsp Arg Asp 50 55 60 Phe Pro Tyr Arg Ser Ile His Met Leu Ala Ser Lys ArgSer Ala Gly 65 70 75 80 Arg Arg Ile Thr Phe Glu Asp Arg Asp Tyr Val ValGln Glu Leu Thr 85 90 95 Pro Glu Ser Phe Asp Gly Val Asp Ile Ala Leu PheSer Ala Gly Gly 100 105 110 Ser Ile Ser Lys His Phe Gly Pro Ile Ala ValAsn Arg Gly Thr Val 115 120 125 Val Val Asp Asn Ser Ser Ala Phe Arg MetAsn Glu Lys Val Pro Leu 130 135 140 Val Ile Pro Glu Val Asn Pro Glu AlaMet Gln Asn Ile Lys Ala Gly 145 150 155 160 Thr Gly Lys Gly Ala Leu IleAla Asn Pro Asn Cys Ser Thr Ile Ile 165 170 175 Cys Leu Met Ala Ala ThrPro Leu His Arg Arg Ala Lys Val Leu Arg 180 185 190 Met Val Val Ser ThrTyr Gln Ala Ala Ser Gly Ala 
Gly Ala Ala Ala 195 200 205 Met Glu Glu LeuGlu Leu Gln Thr Arg Glu Val Leu Glu Gly Lys Pro 210 215 220 Pro Thr CysLys Ile Phe Asn Arg Gln Tyr Ala Phe Asn Leu Phe Ser 225 230 235 240 HisAsn Ala Ser Val Leu Ser Asn Gly Tyr Asn Glu Glu Glu Met Lys 245 250 255Met Val Lys Glu Thr Arg Lys Ile Trp Asn Asp Lys Asp Val Lys Val 260 265270 Thr Ala Thr Cys Ile Arg Val Pro Ile Met Arg Ala His Ala Glu Ser 275280 285 Val Asn Leu Gln Phe Glu Arg Pro Leu Asp Glu Asp Thr Ala Arg Asp290 295 300 Ile Leu Lys Asn Ala Pro Gly Val Val Val Ile Asp Asp Arg GluSer 305 310 315 320 Asn His Phe Pro Thr Pro Leu Glu Val Ser Asn Lys AspAsp Val Ala 325 330 335 Val Gly Arg Ile Arg Gln Asp Leu Ser Gln Asp GlyAsn Gln Gly Leu 340 345 350 Asp Ile Phe Val Cys Gly Asp Gln Ile Arg LysGly Ala Ala Leu Asn 355 360 365 Ala Ile Gln Ile Ala Glu Met Leu Leu 370375 48 1470 DNA Glycine max 48 gcacgaggtc tgttttaaaa tccaacacttaatctctctc ttcgcagcct aaaatcccaa 60 tggcttcact ctctgttttg cgccacaaccacctcttctc gggccccctc ccggcccgcc 120 ccaagcccac ctcctcctcc tcctccaggatccgaatgtc cctccgcgag aacggcccct 180 ccatcgccgt cgtgggcgtc accggcgccgtcggccagga gttcctctcc gtcctctccg 240 accgcgactt cccctaccgc tccattcatatgctggcttc caagcgctcc gctggccgcc 300 gcatcacctt cgaggacagg gactacgtcgtccaggagct cacgccggag agcttcgacg 360 gtgtcgacat cgcgctcttc agcgccggcggctccatcag caagcacttc ggccccatcg 420 ccgtcaatcg tggaacggtc gtggtcgacaacagctccgc gtttcggatg gacgagaagg 480 tgcctttggt aattcccgaa gtgaaccccgaagcaatgca aaacatcaaa gccggaacgg 540 gaaagggcgc actcattgct aaccctaattgctccaccat tagatgcttg aaggctgcta 600 cccctcttca tcgacgtgcc aaggtgttacgtatggttgt tagtacctat caggctgcga 660 gtggtgctgg tgctgctgca atggaagagcttgagctgca aactcgtgag gtgttggaag 720 gaaaaccacc cacttgtaaa atatttaaccgacagtatgc ttttaatcta ttctcacata 780 atgcgtctgt tctttcaaat ggatataatgaagaagaaat gaaaatggtc aaggagacca 840 ggaaaatctg gaatgacaag gatgttaaagtaactgccac atgcatacga gttcccatca 900 tgcgagctca tgctgagagt gtgaatcttcaatttgaaag accccttgat gaggacactg 960 caagagatat tctgaaaaat 
gctccaggtgtagtggttat tgatgatcgt gaatccaatc 1020 attttcctac tccactggaa gtgtcaaacaaggatgatgt tgctgttggt aggattcggc 1080 aggacctgtc tcaggatggg aatcaagggttggacatctt tgtatgtggg gatcaaattc 1140 gcaagggagc tgcacttaac gcaatccagattgctgagat gttgctatga gttctggttt 1200 ttcaaggatc tggtacttaa agattatgcttcttttgaaa cagttttgta tgtgctagtt 1260 gtatgtggtt attcatttct tttgtgatgtttaactagtc caagtatctt ttcaacgatg 1320 tggtagcaca ctagctggaa acagtttttttaaggtcttg gtgcgtaata tctgcaatcc 1380 ttttcaccgg gaataacaag cactggttttggcaaaaaaa aaaaaaaaaa aaaaaaaaaa 1440 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa1470 49 376 PRT Glycine max 49 Met Ala Ser Leu Ser Val Leu Arg His AsnHis Leu Phe Ser Gly Pro 1 5 10 15 Leu Pro Ala Arg Pro Lys Pro Thr SerSer Ser Ser Ser Arg Ile Arg 20 25 30 Met Ser Leu Arg Glu Asn Gly Pro SerIle Ala Val Val Gly Val Thr 35 40 45 Gly Ala Val Gly Gln Glu Phe Leu SerVal Leu Ser Asp Arg Asp Phe 50 55 60 Pro Tyr Arg Ser Ile His Met Leu AlaSer Lys Arg Ser Ala Gly Arg 65 70 75 80 Arg Ile Thr Phe Glu Asp Arg AspTyr Val Val Gln Glu Leu Thr Pro 85 90 95 Glu Ser Phe Asp Gly Val Asp IleAla Leu Phe Ser Ala Gly Gly Ser 100 105 110 Ile Ser Lys His Phe Gly ProIle Ala Val Asn Arg Gly Thr Val Val 115 120 125 Val Asp Asn Ser Ser AlaPhe Arg Met Asp Glu Lys Val Pro Leu Val 130 135 140 Ile Pro Glu Val AsnPro Glu Ala Met Gln Asn Ile Lys Ala Gly Thr 145 150 155 160 Gly Lys GlyAla Leu Ile Ala Asn Pro Asn Cys Ser Thr Ile Arg Cys 165 170 175 Leu LysAla Ala Thr Pro Leu His Arg Arg Ala Lys Val Leu Arg Met 180 185 190 ValVal Ser Thr Tyr Gln Ala Ala Ser Gly Ala Gly Ala Ala Ala Met 195 200 205Glu Glu Leu Glu Leu Gln Thr Arg Glu Val Leu Glu Gly Lys Pro Pro 210 215220 Thr Cys Lys Ile Phe Asn Arg Gln Tyr Ala Phe Asn Leu Phe Ser His 225230 235 240 Asn Ala Ser Val Leu Ser Asn Gly Tyr Asn Glu Glu Glu Met LysMet 245 250 255 Val Lys Glu Thr Arg Lys Ile Trp Asn Asp Lys Asp Val LysVal Thr 260 265 270 Ala Thr Cys Ile Arg Val Pro Ile Met Arg Ala His AlaGlu Ser Val 275 280 285 Asn Leu Gln Phe Glu Arg Pro Leu Asp Glu Asp ThrAla Arg 
Asp Ile 290 295 300 Leu Lys Asn Ala Pro Gly Val Val Val Ile AspAsp Arg Glu Ser Asn 305 310 315 320 His Phe Pro Thr Pro Leu Glu Val SerAsn Lys Asp Asp Val Ala Val 325 330 335 Gly Arg Ile Arg Gln Asp Leu SerGln Asp Gly Asn Gln Gly Leu Asp 340 345 350 Ile Phe Val Cys Gly Asp GlnIle Arg Lys Gly Ala Ala Leu Asn Ala 355 360 365 Ile Gln Ile Ala Glu MetLeu Leu 370 375 50 1609 DNA Triticum aestivum 50 caccaccacc cacctacccaaatcccagcc gccctaaaac cctaggccgc caaacccgcc 60 gccgccgccg ccgcaatgcaggccgccgca gccgtccacc ggccacacct cctcgcggcg 120 tccccgctcg ggggccgcgccagccgccgg ccctccacgg tccgcatggc gctccgcgag 180 gacgggccct ccgtggccatcgtgggcgcc accggcgcgg tggggcagga gttcctccgc 240 gtcatcaccg cccgcgacttcccctaccgc agcctgcgcc tcctcgccag cgagcgctcc 300 gcgggcaagc gcatcgacttcgagggccgg gactacaccg tccaggacct cgcggcgccg 360 ggggccttcg acggggtcgacatcgcgctc ttcagcgccg gcgggagcat cagccgcgcc 420 cacgcgcccg ccgccgtcgccagcggcgcc gtcgtcgtgg ataacagctc cgcctaccgg 480 atggaccccg acgtgccgctcgtcatcccg gaggttaacc ccgaggccat ggccgacgtc 540 cggctcggga aaggggctattgtggccaac cccaactgtt ccaccatcat ctgcctcatg 600 gctgtcacgc cgctgcatcgccacgccaag gtgaaaagga tggttgtcag cacataccaa 660 gcagcaagtg gtgctggtgctgcagccatg gaagaactca aacttcagac tcgagaggtc 720 ttggaaggaa agccaccaacctgtaacatt ttcagtcaac agtatgcttt taatatattt 780 tcgcataatg cacctattgttgaaaatggc tataatgagg aagagatgaa aatggtgaag 840 gagaccagaa aaatctggaatgacaaggat gtaagagtaa ctgcaacttg tatacgggtt 900 cctacgatgc gcgcgcatgccgaaagcgtg aatctacagt ttgaaaagcc acttgatgag 960 gacactgcca gagaaatcttgagggcagct cctggtgtta ccattagtga cgaccgtgct 1020 gccaaccgct tccctacaccactggaggta tcggataaag atgacgtatc agttggtagg 1080 attcgccagg acttgtcacaagatgataac agagggttgg agttatttgt ctgtggagac 1140 cagatacgta aaggcgccgcgctgaacgct gtgcagattg ctgaaatgct actgaagtga 1200 ccgccttttt accattgtctcatgtgccac gttgctctat ccattgatgg attgatgtac 1260 tctagtcact ttcaacccagttttggtcgt cgtctttttt gtaatctgtc aacctagcag 1320 aagaagtgta agacgggctttagtcatctg ttgcacacaa aagtgcagcc acaagtttag 1380 aaaaggaggg 
ttttcacttgttcggatttt gccttaggtt ggactttgtt gcaagtttgt 1440 cgtttgtttc ttgaaagctggtctgctgta actttacccc caaagccctc gagataacga 1500 ggcgtcctgt ggggacctaaaaaaaaaaaa aaaaaaaaaa aaaaaacccc aaaaaaaaaa 1560 aaaaaaaaaa aaaaaaaaaaaaaaaaaaaa aaaaaaaaaa aaaaaaaaa 1609 51 374 PRT Triticum aestivum 51 MetGln Ala Ala Ala Ala Val His Arg Pro His Leu Leu Ala Ala Ser 1 5 10 15Pro Leu Gly Gly Arg Ala Ser Arg Arg Pro Ser Thr Val Arg Met Ala 20 25 30Leu Arg Glu Asp Gly Pro Ser Val Ala Ile Val Gly Ala Thr Gly Ala 35 40 45Val Gly Gln Glu Phe Leu Arg Val Ile Thr Ala Arg Asp Phe Pro Tyr 50 55 60Arg Ser Leu Arg Leu Leu Ala Ser Glu Arg Ser Ala Gly Lys Arg Ile 65 70 7580 Asp Phe Glu Gly Arg Asp Tyr Thr Val Gln Asp Leu Ala Ala Pro Gly 85 9095 Ala Phe Asp Gly Val Asp Ile Ala Leu Phe Ser Ala Gly Gly Ser Ile 100105 110 Ser Arg Ala His Ala Pro Ala Ala Val Ala Ser Gly Ala Val Val Val115 120 125 Asp Asn Ser Ser Ala Tyr Arg Met Asp Pro Asp Val Pro Leu ValIle 130 135 140 Pro Glu Val Asn Pro Glu Ala Met Ala Asp Val Arg Leu GlyLys Gly 145 150 155 160 Ala Ile Val Ala Asn Pro Asn Cys Ser Thr Ile IleCys Leu Met Ala 165 170 175 Val Thr Pro Leu His Arg His Ala Lys Val LysArg Met Val Val Ser 180 185 190 Thr Tyr Gln Ala Ala Ser Gly Ala Gly AlaAla Ala Met Glu Glu Leu 195 200 205 Lys Leu Gln Thr Arg Glu Val Leu GluGly Lys Pro Pro Thr Cys Asn 210 215 220 Ile Phe Ser Gln Gln Tyr Ala PheAsn Ile Phe Ser His Asn Ala Pro 225 230 235 240 Ile Val Glu Asn Gly TyrAsn Glu Glu Glu Met Lys Met Val Lys Glu 245 250 255 Thr Arg Lys Ile TrpAsn Asp Lys Asp Val Arg Val Thr Ala Thr Cys 260 265 270 Ile Arg Val ProThr Met Arg Ala His Ala Glu Ser Val Asn Leu Gln 275 280 285 Phe Glu LysPro Leu Asp Glu Asp Thr Ala Arg Glu Ile Leu Arg Ala 290 295 300 Ala ProGly Val Thr Ile Ser Asp Asp Arg Ala Ala Asn Arg Phe Pro 305 310 315 320Thr Pro Leu Glu Val Ser Asp Lys Asp Asp Val Ser Val Gly Arg Ile 325 330335 Arg Gln Asp Leu Ser Gln Asp Asp Asn Arg Gly Leu Glu Leu Phe Val 340345 350 Cys Gly Asp Gln Ile Arg Lys Gly Ala Ala Leu Asn Ala Val Gln 
Ile355 360 365 Ala Glu Met Leu Leu Lys 370 52 340 PRT Aquifex aeolicus 52Met Gly Tyr Arg Val Ala Ile Val Gly Ala Thr Gly Glu Val Gly Arg 1 5 1015 Thr Phe Leu Lys Val Leu Glu Glu Arg Asn Phe Pro Val Asp Glu Leu 20 2530 Val Leu Tyr Ala Ser Glu Arg Ser Glu Gly Lys Val Leu Thr Phe Lys 35 4045 Gly Lys Glu Tyr Thr Val Lys Ala Leu Asn Lys Glu Asn Ser Phe Lys 50 5560 Gly Ile Asp Ile Ala Leu Phe Ser Ala Gly Gly Ser Thr Ser Lys Glu 65 7075 80 Trp Ala Pro Lys Phe Ala Lys Asp Gly Val Val Val Ile Asp Asn Ser 8590 95 Ser Ala Trp Arg Met Asp Pro Asp Val Pro Leu Val Val Pro Glu Val100 105 110 Asn Pro Glu Asp Val Lys Asp Phe Lys Lys Lys Gly Ile Ile AlaAsn 115 120 125 Pro Asn Cys Ser Thr Ile Gln Met Val Val Ala Leu Lys ProIle Tyr 130 135 140 Asp Lys Ala Gly Ile Lys Arg Val Val Val Ser Thr TyrGln Ala Val 145 150 155 160 Ser Gly Ala Gly Ala Lys Ala Ile Glu Asp LeuLys Asn Gln Thr Lys 165 170 175 Ala Trp Cys Glu Gly Lys Glu Met Pro LysAla Gln Lys Phe Pro His 180 185 190 Gln Ile Ala Phe Asn Ala Leu Pro HisIle Asp Val Phe Phe Glu Asp 195 200 205 Gly Tyr Thr Lys Glu Glu Asn LysMet Leu Tyr Glu Thr Arg Lys Ile 210 215 220 Met His Asp Glu Asn Ile LysVal Ser Ala Thr Cys Val Arg Ile Pro 225 230 235 240 Val Phe Tyr Gly HisSer Glu Ser Ile Ser Met Glu Thr Glu Lys Glu 245 250 255 Ile Ser Pro GluGlu Ala Arg Glu Val Leu Lys Asn Ala Pro Gly Val 260 265 270 Ile Val IleAsp Asn Pro Gln Asn Asn Glu Tyr Pro Met Pro Ile Met 275 280 285 Ala GluGly Arg Asp Glu Val Phe Val Gly Arg Ile Arg Lys Asp Arg 290 295 300 ValPhe Glu Pro Gly Leu Ser Met Trp Val Val Ala Asp Asn Ile Arg 305 310 315320 Lys Gly Ala Ala Thr Asn Ala Val Gln Ile Ala Glu Leu Leu Val Lys 325330 335 Glu Gly Leu Ile 340 53 1727 DNA Glycine max 53 ttgcaacacacattgtcttg tcggcaaaat cttccaccaa caacacacag ccatggcagg 60 ctcaaacattctttctcact ctccttccct tcccaaaacc tacagccact ccttaaacca 120 aaacgcgttatcccaaaagc ttttttttct gcccctcaaa ttcaaagcca ccacaaaacc 180 acgtgctctcagagcggttc tctcgcagaa cgctgtcaaa acctcggtgg aggacacaaa 240 gaacgctcattttcagcact gtttcaccaa 
atccgaagat gggtatctgt actgtgaggg 300 cctcaaggtgcatgacatca tggaatctgt tgagagaaga cctttctatt tgtacagcaa 360 gccccagataactaggaatg ttgaagccta caaggatgca ttggaagggt tgaactccat 420 aattggttatgccattaagg ccaataataa cttgaagatt ttggaacatt tgaggcactt 480 gggttgtggtgctgtgcttg ttagtgggaa tgagctgaag ttggctcttc gagctggctt 540 tgatcccacaaggtgtatct ttaatgggaa tgggaaaatc ttggaggatt tggtcttggc 600 tgctcaggaaggtgtgtttg tcaacattga tagtgagttt gacttggaaa acattgtaga 660 ggctgcaaaaagggctggga agaaggtcaa tgttttactt cggattaatc ctgatgtgga 720 tccacaggttcatccttatg ttgccactgg gaataagaac tctaaatttg gcattagaaa 780 tgagaagctgcagtgctttt tagatgcagt gaaggaacat cctaatgagc tcaaacttgt 840 aggggcccactgccatcttg gttcaacaat taccaaggtt gacattttca gggatgcagc 900 caccattatgatcaactaca ttgaccaaat ccgagatcag ggttttgaag ttgattactt 960 aaatattggtggaggacttg ggatagatta ttatcattct ggtgccatcc ttcctacacc 1020 tagagatctcattgacactg tacgagatct tgttatttca cgtggtctta atctcatcat 1080 tgaaccaggaagatcactca ttgcaaacac gtgttgctta gttaaccggg tgacaggtgt 1140 taaaactaatggatctaaaa acttcattgt aattgatgga agtatggctg aacttatccg 1200 ccctagtctttatgatgctt accagcatat agagctggtt tcccctgccc cgtcaaatgc 1260 tgaaacagaaacttttgatg tggttggccc tgtctgtgag tctgcagatt tcttaggaaa 1320 aggaagagaacttcctactc cagccaaggg tactggtttg gttgttcatg atgctggtgc 1380 ttattgcatgagcatggcat caacctacaa tctaaagatg cggcctcctg agtattgggt 1440 tgaagatgatggatcagtga gcaaaataag acatggagag acttttgaag accacattcg 1500 gttttttgaggggctttgag ctaataattt atcttgtagg aaagaaggct ggagaattgt 1560 tatgtacttggagtttgaat ctttcctcgt caatgaatgc atgactcttg tagttctgtt 1620 tcttccgttctaattgaatg ttgactccca tgacaggaac agagaataaa gttgatttca 1680 gttaaaaaaaaaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaa 1727 54 505 PRT Glycine max 54Cys Asn Thr His Cys Leu Val Gly Lys Ile Phe His Gln Gln His Thr 1 5 1015 Ala Met Ala Gly Ser Asn Ile Leu Ser His Ser Pro Ser Leu Pro Lys 20 2530 Thr Tyr Ser His Ser Leu Asn Gln Asn Ala Leu Ser Gln Lys Leu Phe 35 4045 Phe Leu Pro Leu Lys Phe Lys Ala Thr Thr Lys Pro Arg Ala Leu Arg 
50 55 60 Ala Val Leu Ser Gln Asn Ala Val Lys Thr Ser Val Glu Asp Thr Lys 65 70 75 80 Asn Ala His Phe Gln His Cys Phe Thr Lys Ser Glu Asp Gly Tyr Leu 85 90 95 Tyr Cys Glu Gly Leu Lys Val His Asp Ile Met Glu Ser Val Glu Arg 100 105 110 Arg Pro Phe Tyr Leu Tyr Ser Lys Pro Gln Ile Thr Arg Asn Val Glu 115 120 125 Ala Tyr Lys Asp Ala Leu Glu Gly Leu Asn Ser Ile Ile Gly Tyr Ala 130 135 140 Ile Lys Ala Asn Asn Asn Leu Lys Ile Leu Glu His Leu Arg His Leu 145 150 155 160 Gly Cys Gly Ala Val Leu Val Ser Gly Asn Glu Leu Lys Leu Ala Leu 165 170 175 Arg Ala Gly Phe Asp Pro Thr Arg Cys Ile Phe Asn Gly Asn Gly Lys 180 185 190 Ile Leu Glu Asp Leu Val Leu Ala Ala Gln Glu Gly Val Phe Val Asn 195 200 205 Ile Asp Ser Glu Phe Asp Leu Glu Asn Ile Val Glu Ala Ala Lys Arg 210 215 220 Ala Gly Lys Lys Val Asn Val Leu Leu Arg Ile Asn Pro Asp Val Asp 225 230 235 240 Pro Gln Val His Pro Tyr Val Ala Thr Gly Asn Lys Asn Ser Lys Phe 245 250 255 Gly Ile Arg Asn Glu Lys Leu Gln Cys Phe Leu Asp Ala Val Lys Glu 260 265 270 His Pro Asn Glu Leu Lys Leu Val Gly Ala His Cys His Leu Gly Ser 275 280 285 Thr Ile Thr Lys Val Asp Ile Phe Arg Asp Ala Ala Thr Ile Met Ile 290 295 300 Asn Tyr Ile Asp Gln Ile Arg Asp Gln Gly Phe Glu Val Asp Tyr Leu 305 310 315 320 Asn Ile Gly Gly Gly Leu Gly Ile Asp Tyr Tyr His Ser Gly Ala Ile 325 330 335 Leu Pro Thr Pro Arg Asp Leu Ile Asp Thr Val Arg Asp Leu Val Ile 340 345 350 Ser Arg Gly Leu Asn Leu Ile Ile Glu Pro Gly Arg Ser Leu Ile Ala 355 360 365 Asn Thr Cys Cys Leu Val Asn Arg Val Thr Gly Val Lys Thr Asn Gly 370 375 380 Ser Lys Asn Phe Ile Val Ile Asp Gly Ser Met Ala Glu Leu Ile Arg 385 390 395 400 Pro Ser Leu Tyr Asp Ala Tyr Gln His Ile Glu Leu Val Ser Pro Ala 405 410 415 Pro Ser Asn Ala Glu Thr Glu Thr Phe Asp Val Val Gly Pro Val Cys 420 425 430 Glu Ser Ala Asp Phe Leu Gly Lys Gly Arg Glu Leu Pro Thr Pro Ala 435 440 445 Lys Gly Thr Gly Leu Val Val His Asp Ala Gly Ala Tyr Cys Met Ser 450 455 460 Met Ala Ser Thr Tyr Asn Leu Lys Met Arg Pro Pro Glu Tyr Trp Val 465 470 475 480 Glu Asp Asp Gly Ser Val
Ser Lys Ile Arg His Gly Glu Thr Phe Glu 485 490 495 Asp His Ile ArgPhe Phe Glu Gly Leu 500 505 55 858 DNA Triticum aestivum 55 tttgagttggagtacctgaa tattggaggt ggtttgggga tagactacca ccacactggt 60 gcagtcttgcctacacctat ggatcttatc aacactgtcc gggaattggt cctctcacgg 120 gatcttactctcattattga acctggaaga tccctgatcg ccaatacttg ctgcttcgtc 180 aataaggtcactggtgtaaa atcgaatggc acgaagaatt tcattgtagt tgatggcagc 240 atggccgagctcatcaggcc tagtctatat ggagcatatc agcatataga actagtttct 300 ccctctccaggtgcagaagt agcaaccttc gatattgttg ggccagtctg cgaatctgca 360 gatttccttggcaaagacag ggagcttcca acacctgaca agggagctgg tttggttgtc 420 cacgacgcaggagcctactg catgagcatg gcttcgacct acaacctgaa gatgaggcca 480 gccgagtattgggtagagga cgatgggtcc attgttaaga tcaggcacgg tgaaacattt 540 gacgactacatgaagttctt tgatggtctt cctgcctagg cccttttatc ttgttttggg 600 caagcgtagcccttttcatt tgatgagcgc atctcgtgga agattcgtgt gggaaaacta 660 ttcacttgtttgttatgtgg gtcatcccca tcaagcatgg gggtttttat ttgttagaat 720 agagtccaacaagtttagtg attgtagaga ttgaatggac ttactgcatt gttatcaatt 780 cttgtttatactatataaag ggtccgactc ctcccaataa agttaaagaa tattgttgtt 840 tacttttatctaaaaaaa 858 56 192 PRT Triticum aestivum 56 Phe Glu Leu Glu Tyr Leu AsnIle Gly Gly Gly Leu Gly Ile Asp Tyr 1 5 10 15 His His Thr Gly Ala ValLeu Pro Thr Pro Met Asp Leu Ile Asn Thr 20 25 30 Val Arg Glu Leu Val LeuSer Arg Asp Leu Thr Leu Ile Ile Glu Pro 35 40 45 Gly Arg Ser Leu Ile AlaAsn Thr Cys Cys Phe Val Asn Lys Val Thr 50 55 60 Gly Val Lys Ser Asn GlyThr Lys Asn Phe Ile Val Val Asp Gly Ser 65 70 75 80 Met Ala Glu Leu IleArg Pro Ser Leu Tyr Gly Ala Tyr Gln His Ile 85 90 95 Glu Leu Val Ser ProSer Pro Gly Ala Glu Val Ala Thr Phe Asp Ile 100 105 110 Val Gly Pro ValCys Glu Ser Ala Asp Phe Leu Gly Lys Asp Arg Glu 115 120 125 Leu Pro ThrPro Asp Lys Gly Ala Gly Leu Val Val His Asp Ala Gly 130 135 140 Ala TyrCys Met Ser Met Ala Ser Thr Tyr Asn Leu Lys Met Arg Pro 145 150 155 160Ala Glu Tyr Trp Val Glu Asp Asp Gly Ser Ile Val Lys Ile Arg His 165 170175 Gly Glu Thr Phe Asp Asp Tyr Met Lys 
Phe Phe Asp Gly Leu Pro Ala 180 185 190 57 526 PRT Arabidopsis thaliana 57 Met Gly Gln Thr Asn Ser Glu Thr Gln Gln Ala Arg Leu Tyr Thr Gln 1 5 10 15 Asn Ser Gln Lys Gln Leu Leu Arg Ser Phe Leu Leu Leu His Leu Ile 20 25 30 Phe Gly Tyr Gln Ser His Lys Thr Leu Arg Met Ala Ala Ala Thr Gln 35 40 45 Phe Leu Ser Gln Pro Ser Ser Leu Asn Pro His Gln Leu Lys Asn Gln 50 55 60 Thr Ser Gln Arg Ser Arg Ser Ile Pro Val Leu Ser Leu Lys Ser Thr 65 70 75 80 Leu Lys Pro Leu Lys Arg Leu Ser Val Lys Ala Ala Val Val Ser Gln 85 90 95 Asn Ser Ser Lys Thr Val Thr Lys Phe Asp His Cys Phe Lys Lys Ser 100 105 110 Ser Asp Gly Phe Leu Tyr Cys Glu Gly Thr Lys Val Glu Asp Ile Met 115 120 125 Glu Ser Val Glu Arg Arg Pro Phe Tyr Leu Tyr Ser Lys Pro Gln Ile 130 135 140 Thr Arg Asn Leu Glu Ala Tyr Lys Glu Ala Leu Glu Gly Val Ser Ser 145 150 155 160 Val Ile Gly Tyr Ala Ile Lys Ala Asn Asn Asn Leu Lys Ile Leu Glu 165 170 175 His Leu Arg Ser Leu Gly Cys Gly Ala Val Leu Val Ser Gly Asn Glu 180 185 190 Leu Arg Leu Ala Leu Arg Ala Gly Phe Asp Pro Thr Lys Cys Ile Phe 195 200 205 Asn Gly Asn Gly Lys Ser Leu Glu Asp Leu Val Leu Ala Ala Gln Glu 210 215 220 Gly Val Phe Val Asn Val Asp Ser Glu Phe Asp Leu Asn Asn Ile Val 225 230 235 240 Glu Ala Ser Arg Ile Ser Gly Lys Gln Val Asn Val Leu Leu Arg Ile 245 250 255 Asn Pro Asp Val Asp Pro Gln Val His Pro Tyr Val Ala Thr Gly Asn 260 265 270 Lys Asn Ser Lys Phe Gly Ile Arg Asn Glu Lys Leu Gln Trp Phe Leu 275 280 285 Asp Gln Val Lys Ala His Pro Lys Glu Leu Lys Leu Val Gly Ala His 290 295 300 Cys His Leu Gly Ser Thr Ile Thr Lys Val Asp Ile Phe Arg Asp Ala 305 310 315 320 Ala Val Leu Met Ile Glu Tyr Ile Asp Glu Ile Arg Arg Gln Gly Phe 325 330 335 Glu Val Ser Tyr Leu Asn Ile Gly Gly Gly Leu Gly Ile Asp Tyr Tyr 340 345 350 His Ala Gly Ala Val Leu Pro Thr Pro Met Asp Leu Ile Asn Thr Val 355 360 365 Arg Glu Leu Val Leu Ser Arg Asp Leu Asn Leu Ile Ile Glu Pro Gly 370 375 380 Arg Ser Leu Ile Ala Asn Thr Cys Cys Phe Val Asn His Val Thr Gly 385 390 395 400 Val Lys Thr Asn Gly Thr Lys Asn Phe Ile Val Ile
Asp Gly Ser Met 405 410 415 Ala Glu Leu Ile Arg Pro Ser Leu Tyr Asp Ala Tyr Gln His Ile Glu 420 425 430 Leu Val Ser Pro Pro Pro Ala Glu Ala Glu Val Thr Lys Phe Asp Val 435 440 445 Val Gly Pro Val Cys Glu Ser Ala Asp Phe Leu Gly Lys Asp Arg Glu 450 455 460 Leu Pro Thr Pro Pro Gln Gly Ala Gly Leu Val Val His Asp Ala Gly 465 470 475 480 Ala Tyr Cys Met Ser Met Ala Ser Thr Tyr Asn Leu Lys Met Arg Pro 485 490 495 Pro Glu Tyr Trp Val Glu Glu Asp Gly Ser Ile Thr Lys Ile Arg His 500 505 510 Ala Glu Thr Phe Asp Asp His Leu Arg Phe Phe Glu Gly Leu 515 520 525 58 1143 DNA Oryza sativa 58 gcacgaggtc gccgccatcg ctgcccttcg cgccctcgat gtcaagtccc acgccgtctc 60 catccacctc accaagggcc tccccctcgg ctccggcctc ggctcctccg ccgcctccgc 120 cgccgccgct gccaaggccg ttgacgccct cttcggctcc ctcctacacc aagatgacct 180 cgtcctcgcg ggcctcgagt ccgagaaagc cgtcagtggc ttccacgccg acaacatcgc 240 cccggccatc ctcggcggct tcgtcctcgt ccgcagctac gaccccttcc acctcatccc 300 gctctcctcc ccacctgccc tccgcctcca cttcgtcctc gtcacgcccg acttcgaggc 360 gcccaccagc aagatgcgtg ccgcgctgcc caaacaggtg gccgtccacc agcacgtccg 420 caactccagc caagcggccg cgcttgtcgc cgctgtgctg caaggggacg ccaccctcat 480 cggctccgca atgtcctccg acggcatcgt ggagccaacc agggcgccgc tgattcctgg 540 catggctgcg gtcaaggccg cggcgttgga agctggggca ttgggctgca ccatcagtgg 600 agcagggcca actgctgtgg ctgtcattga cggggaggag aagggcgagg aggttggccg 660 gaggatggtg gaggcattcg ccaatgccgg caatctcaaa gcaacagcta ctgttgctca 720 gctcgataga gttggtgcca gggttatctc tacctccact ttggagtagg aagatctggg 780 aggactgctc cggtaggtca aatttggaat ggctcacatg gacactagtg ggaggagaag 840 aaggggggat tggtgtgttt tgtaattcct gggctgacca gaacgattgt cagtcagttg 900 ggttgtgaat tgtgtgatgt agtagcaaac tgattcgtgc cggcaattga attgcaataa 960 gctagtggtt gcagcatcac ctggcgaggc gtagctagga gatgcagaaa cagcattttg 1020 acatgtgtgg gtgttgacat gcaacgaata aaatgaatga agctgaattg gggtttaaaa 1080 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaata 1140 aaa 1143 59 255 PRT Oryza sativa 59 His Glu Val Ala Ala Ile Ala Ala Leu Arg Ala Leu Asp Val Lys Ser 1 5 10 15
His Ala Val Ser Ile His Leu Thr Lys Gly Leu Pro Leu Gly Ser Gly 20 25 30 Leu Gly Ser Ser Ala Ala Ser Ala Ala Ala Ala Ala Lys Ala Val Asp 35 40 45 Ala Leu Phe Gly Ser Leu Leu His Gln Asp Asp Leu Val Leu Ala Gly 50 55 60 Leu Glu Ser Glu Lys Ala Val Ser Gly Phe His Ala Asp Asn Ile Ala 65 70 75 80 Pro Ala Ile Leu Gly Gly Phe Val Leu Val Arg Ser Tyr Asp Pro Phe 85 90 95 His Leu Ile Pro Leu Ser Ser Pro Pro Ala Leu Arg Leu His Phe Val 100 105 110 Leu Val Thr Pro Asp Phe Glu Ala Pro Thr Ser Lys Met Arg Ala Ala 115 120 125 Leu Pro Lys Gln Val Ala Val His Gln His Val Arg Asn Ser Ser Gln 130 135 140 Ala Ala Ala Leu Val Ala Ala Val Leu Gln Gly Asp Ala Thr Leu Ile 145 150 155 160 Gly Ser Ala Met Ser Ser Asp Gly Ile Val Glu Pro Thr Arg Ala Pro 165 170 175 Leu Ile Pro Gly Met Ala Ala Val Lys Ala Ala Ala Leu Glu Ala Gly 180 185 190 Ala Leu Gly Cys Thr Ile Ser Gly Ala Gly Pro Thr Ala Val Ala Val 195 200 205 Ile Asp Gly Glu Glu Lys Gly Glu Glu Val Gly Arg Arg Met Val Glu 210 215 220 Ala Phe Ala Asn Ala Gly Asn Leu Lys Ala Thr Ala Thr Val Ala Gln 225 230 235 240 Leu Asp Arg Val Gly Ala Arg Val Ile Ser Thr Ser Thr Leu Glu 245 250 255 60 370 PRT Arabidopsis thaliana 60 Met Ala Ser Leu Cys Phe Gln Ser Pro Ser Lys Pro Ile Ser Tyr Phe 1 5 10 15 Gln Pro Lys Ser Asn Pro Ser Pro Pro Leu Phe Ala Lys Val Ser Val 20 25 30 Phe Arg Cys Arg Ala Ser Val Gln Thr Leu Val Ala Val Glu Pro Glu 35 40 45 Pro Val Phe Val Ser Val Lys Thr Phe Ala Pro Ala Thr Val Ala Asn 50 55 60 Leu Gly Pro Gly Phe Asp Phe Leu Gly Cys Ala Val Asp Gly Leu Gly 65 70 75 80 Asp His Val Thr Leu Arg Val Asp Pro Ser Val Arg Ala Gly Glu Val 85 90 95 Ser Ile Ser Glu Ile Thr Gly Thr Thr Thr Lys Leu Ser Thr Asn Pro 100 105 110 Leu Arg Asn Cys Ala Gly Ile Ala Ala Ile Ala Thr Met Lys Met Leu 115 120 125 Gly Ile Arg Ser Val Gly Leu Ser Leu Asp Leu His Lys Gly Leu Pro 130 135 140 Leu Gly Ser Gly Leu Gly Ser Ser Ala Ala Ser Ala Ala Ala Ala Ala 145 150 155 160 Val Ala Val Asn Glu Ile Phe Gly Arg Lys Leu Gly Ser Asp Gln Leu 165 170 175 Val Leu Ala Gly Leu Glu Ser Glu
Ala Lys Val Ser Gly Tyr His Ala 180 185 190 Asp Asn Ile Ala Pro Ala Ile Met Gly Gly Phe Val Leu Ile Arg Asn 195 200 205 Tyr Glu Pro Leu Asp Leu Lys Pro Leu Lys Phe Pro Ser Asp Lys Asp 210 215 220 Leu Phe Phe Val Leu Val Ser Pro Glu Phe Glu Ala Pro Thr Lys Lys 225 230 235 240 Met Arg Ala Ala Leu Pro Thr Glu Ile Pro Met Val His His Val Trp 245 250 255 Asn Ser Ser Gln Ala Ala Ala Leu Val Ala Ala Val Leu Glu Gly Asp 260 265 270 Ala Val Met Leu Gly Lys Ala Leu Ser Ser Asp Lys Ile Val Glu Pro 275 280 285 Thr Arg Ala Pro Leu Ile Pro Gly Met Glu Ala Val Lys Lys Ala Ala 290 295 300 Leu Glu Ala Gly Ala Phe Gly Cys Thr Ile Ser Gly Ala Gly Pro Thr 305 310 315 320 Ala Val Ala Val Ile Asp Ser Glu Glu Lys Gly Gln Val Ile Gly Glu 325 330 335 Lys Met Val Glu Ala Phe Trp Lys Val Gly His Leu Lys Ser Val Ala 340 345 350 Ser Val Lys Lys Leu Asp Lys Val Gly Ala Arg Leu Val Asn Ser Val 355 360 365 Ser Arg 370 61 1508 DNA Zea mays 61 aaggatggcg tcgtggtcgt cgccctcagc cgccgccaac gccgcctcgg gcgcccgatt 60 cggccccttc ccgagcggag ggcagcggct cgcgccgtgt ccgtcgctcg tccgcggaac 120 tcccgccccg acgctcgtcc tcaggctcca cccggacggc cgtggccatg gcctcctcgc 180 gcacaccggc ccctctccct cctcgcggtg ccgcgccgtc gccgccgagg tcgggggcct 240 caacatcgcc aacgacgtca cccagctcat cggcaacaca ccaatggtgt atctcaacaa 300 cgtcgtcaag ggctctgtcg ccaatgtcgc tgctaagctc gagattatgg agccctgctg 360 tagcgtcaag gacaggatag ggtacagtat gataaatgat gctgaacaga agggcttgat 420 tactcctgga aagagtgttt tggtggaagc aacaagtgga aacacaggca ttggtcttgc 480 tttcattgct gcttccaaag gatataagct gatactaaca atgccttcat caatgagcat 540 ggagagaaga gtcctcctta gagcttttgg tgccgaactt gtccttactg atgctgcaaa 600 agggatgaaa ggggccttag ataaggctac agagatttta aacaagacac caaattctta 660 catgcttcaa cagttcgata accctgccaa ccctcaggta cattatgaga ctactggtcc 720 agagatctgg gaggattcaa aggggaaggt ggatatattc attggtggaa ttggaacagg 780 ggggacaata tctggtgccg gccgttttct caaggagaaa aatcctggaa ttaaggttat 840 tggtattgag ccttctgaaa gtaacatact ctccggtgga aaacctggtc cacataagat 900 ccagggaatc ggcgcaggat ttgttccaag gaacttggat agcgatattc
ttgatgaagt 960 aattgagata tcaagtgatg aagctgttga gacagcaaaacagttggctg ttcaggaagg 1020 attactggtt ggaatctcct ctggagcagc cgccgctgctgccataaagg ttgccaaaag 1080 accagagaat gctggaaagc tgatagtggt tgtgtttccgagcttcggcg agaggtacct 1140 ttcatctgtc ctctatcagt ccataagaga agaatgtgagaacatgcaac ctgagccatg 1200 agggagccgt cactttaagc gggcatagta aatgtttctgaaataagacg cgtagccagc 1260 atcagtttgc tccacttgga atcatttggc catgctcactctatcctttc gctagcctct 1320 atgaccggac ctaaactggt gtgtgagaaa catccacgactgtcctccca actgctttcc 1380 taaagccaaa cgataacact ctcaataatt gtctatacgattgaagctga tttgattggt 1440 aattgtaaac agcttgtctt tggatctttg aagtcaaacaaagtcagttg gttgaatcaa 1500 aaaaaaaa 1508 62 398 PRT Zea mays 62 Met AlaSer Trp Ser Ser Pro Ser Ala Ala Ala Asn Ala Ala Ser Gly 1 5 10 15 AlaArg Phe Gly Pro Phe Pro Ser Gly Gly Gln Arg Leu Ala Pro Cys 20 25 30 ProSer Leu Val Arg Gly Thr Pro Ala Pro Thr Leu Val Leu Arg Leu 35 40 45 HisPro Asp Gly Arg Gly His Gly Leu Leu Ala His Thr Gly Pro Ser 50 55 60 ProSer Ser Arg Cys Arg Ala Val Ala Ala Glu Val Gly Gly Leu Asn 65 70 75 80Ile Ala Asn Asp Val Thr Gln Leu Ile Gly Asn Thr Pro Met Val Tyr 85 90 95Leu Asn Asn Val Val Lys Gly Ser Val Ala Asn Val Ala Ala Lys Leu 100 105110 Glu Ile Met Glu Pro Cys Cys Ser Val Lys Asp Arg Ile Gly Tyr Ser 115120 125 Met Ile Asn Asp Ala Glu Gln Lys Gly Leu Ile Thr Pro Gly Lys Ser130 135 140 Val Leu Val Glu Ala Thr Ser Gly Asn Thr Gly Ile Gly Leu AlaPhe 145 150 155 160 Ile Ala Ala Ser Lys Gly Tyr Lys Leu Ile Leu Thr MetPro Ser Ser 165 170 175 Met Ser Met Glu Arg Arg Val Leu Leu Arg Ala PheGly Ala Glu Leu 180 185 190 Val Leu Thr Asp Ala Ala Lys Gly Met Lys GlyAla Leu Asp Lys Ala 195 200 205 Thr Glu Ile Leu Asn Lys Thr Pro Asn SerTyr Met Leu Gln Gln Phe 210 215 220 Asp Asn Pro Ala Asn Pro Gln Val HisTyr Glu Thr Thr Gly Pro Glu 225 230 235 240 Ile Trp Glu Asp Ser Lys GlyLys Val Asp Ile Phe Ile Gly Gly Ile 245 250 255 Gly Thr Gly Gly Thr IleSer Gly Ala Gly Arg Phe Leu Lys Glu Lys 260 265 270 Asn Pro Gly Ile LysVal Ile Gly Ile Glu Pro Ser 
Glu Ser Asn Ile 275 280 285 Leu Ser Gly GlyLys Pro Gly Pro His Lys Ile Gln Gly Ile Gly Ala 290 295 300 Gly Phe ValPro Arg Asn Leu Asp Ser Asp Ile Leu Asp Glu Val Ile 305 310 315 320 GluIle Ser Ser Asp Glu Ala Val Glu Thr Ala Lys Gln Leu Ala Val 325 330 335Gln Glu Gly Leu Leu Val Gly Ile Ser Ser Gly Ala Ala Ala Ala Ala 340 345350 Ala Ile Lys Val Ala Lys Arg Pro Glu Asn Ala Gly Lys Leu Ile Val 355360 365 Val Val Phe Pro Ser Phe Gly Glu Arg Tyr Leu Ser Ser Val Leu Tyr370 375 380 Gln Ser Ile Arg Glu Glu Cys Glu Asn Met Gln Pro Glu Pro 385390 395 63 1522 DNA Oryza sativa 63 gcacgaggtt ctaactacgg aactactcccctatccaaca cctccgagtc cgagcaacgc 60 aagatggcgt cgtggtcgtc gcccgtcgccgccgccgcct tgcaggtcca tttcgggtcc 120 tcctgcttct tctccgcccg atcgccacgacagaccctcc tcctaccacc tctcgcccgc 180 aaccctacac tgaccatcca gccccggccccatcccttcc ggaacatcaa ctcctcctcc 240 tcctccagct ggatgtgcca cgccgtcgccgccgaggtcg agggcctcaa catcgccgac 300 gacgtcaccc agctcatcgg caagactccaatggtatatc tcaacaacat cgtcaaggga 360 tgtgttgcca atgtcgctgc taagctcgagattatggagc cctgttgcag tgtcaaggac 420 aggataggat acagtatgat ttctgatgcggaagagaaag gcttgataac tcctggaaag 480 agtgttttgg tggaaccaac aagtggaaatacaggcattg gtcttgcctt cattgctgct 540 tccagaggat ataaattaat attgaccatgcctgcatcaa tgagcatgga gagaagagtt 600 ctactcaaag cttttggcgc tgaacttgtccttactgatg ccgcaaaagg gatgaagggg 660 gctgtagata aggctacaga gattttaaataagacacctg atgcctatat gctgcagcag 720 tttgacaacc ctgccaaccc aaaggtacattatgagacta ctgggccaga aatctgggag 780 gattctaaag ggaaggtgga tgtattcattggtggaattg gaacaggtgg aacaatatct 840 ggtgctggcc gtttcctgaa agagaaaaatcctggaatta aggttattgg tattgagcct 900 tctgagagta acatactctc tggtggaaaacctggcccac ataagattca aggcattggg 960 gcaggatttg ttccaaggaa cttggatagtgaagttctcg atgaagtgat tgagatatct 1020 agtgatgagg ctgttgagac agcaaagcaattggctcttc aggaaggatt actggttgga 1080 atttcatctg gggcagcagc agcagctgccattaaagttg caaaaagacc agaaaatgct 1140 ggaaagttgg tagtggttgt gtttccaagctttggtgaga ggtacctttc atctatcctt 1200 tttcagtcga taagagaaga 
atgtgagaagttgcaacctg aaccatgagc ctaacttcag 1260 tgttcacaac atcataattg tttctgagatttctggccat tagttttttt ttctgagaag 1320 tatcatacca ctccatagct gtttgttcgataaataaaac agttaccttt gcacttataa 1380 tgaggcttgt gagggtactg tgaaatttctctgaacatct tctactcttc tcttttatcc 1440 ttaaatcaat ctgggagcag tttgtaatacatacgtaaat ttaaagctgg gtgtttggta 1500 attgtaaaaa aaaaaaaaaa aa 1522 64415 PRT Oryza sativa 64 Ala Arg Gly Ser Asn Tyr Gly Thr Thr Pro Leu SerAsn Thr Ser Glu 1 5 10 15 Ser Glu Gln Arg Lys Met Ala Ser Trp Ser SerPro Val Ala Ala Ala 20 25 30 Ala Leu Gln Val His Phe Gly Ser Ser Cys PhePhe Ser Ala Arg Ser 35 40 45 Pro Arg Gln Thr Leu Leu Leu Pro Pro Leu AlaArg Asn Pro Thr Leu 50 55 60 Thr Ile Gln Pro Arg Pro His Pro Phe Arg AsnIle Asn Ser Ser Ser 65 70 75 80 Ser Ser Ser Trp Met Cys His Ala Val AlaAla Glu Val Glu Gly Leu 85 90 95 Asn Ile Ala Asp Asp Val Thr Gln Leu IleGly Lys Thr Pro Met Val 100 105 110 Tyr Leu Asn Asn Ile Val Lys Gly CysVal Ala Asn Val Ala Ala Lys 115 120 125 Leu Glu Ile Met Glu Pro Cys CysSer Val Lys Asp Arg Ile Gly Tyr 130 135 140 Ser Met Ile Ser Asp Ala GluGlu Lys Gly Leu Ile Thr Pro Gly Lys 145 150 155 160 Ser Val Leu Val GluPro Thr Ser Gly Asn Thr Gly Ile Gly Leu Ala 165 170 175 Phe Ile Ala AlaSer Arg Gly Tyr Lys Leu Ile Leu Thr Met Pro Ala 180 185 190 Ser Met SerMet Glu Arg Arg Val Leu Leu Lys Ala Phe Gly Ala Glu 195 200 205 Leu ValLeu Thr Asp Ala Ala Lys Gly Met Lys Gly Ala Val Asp Lys 210 215 220 AlaThr Glu Ile Leu Asn Lys Thr Pro Asp Ala Tyr Met Leu Gln Gln 225 230 235240 Phe Asp Asn Pro Ala Asn Pro Lys Val His Tyr Glu Thr Thr Gly Pro 245250 255 Glu Ile Trp Glu Asp Ser Lys Gly Lys Val Asp Val Phe Ile Gly Gly260 265 270 Ile Gly Thr Gly Gly Thr Ile Ser Gly Ala Gly Arg Phe Leu LysGlu 275 280 285 Lys Asn Pro Gly Ile Lys Val Ile Gly Ile Glu Pro Ser GluSer Asn 290 295 300 Ile Leu Ser Gly Gly Lys Pro Gly Pro His Lys Ile GlnGly Ile Gly 305 310 315 320 Ala Gly Phe Val Pro Arg Asn Leu Asp Ser GluVal Leu Asp Glu Val 325 330 335 Ile Glu Ile Ser Ser Asp Glu Ala Val GluThr 
Ala Lys Gln Leu Ala 340 345 350 Leu Gln Glu Gly Leu Leu Val Gly IleSer Ser Gly Ala Ala Ala Ala 355 360 365 Ala Ala Ile Lys Val Ala Lys ArgPro Glu Asn Ala Gly Lys Leu Val 370 375 380 Val Val Val Phe Pro Ser PheGly Glu Arg Tyr Leu Ser Ser Ile Leu 385 390 395 400 Phe Gln Ser Ile ArgGlu Glu Cys Glu Lys Leu Gln Pro Glu Pro 405 410 415 65 383 PRT Spinaciaoleracea 65 Met Ala Ser Leu Val Asn Asn Ala Tyr Ala Ala Ile Arg Thr SerLys 1 5 10 15 Leu Glu Leu Arg Glu Val Lys Asn Leu Ala Asn Phe Arg ValGly Pro 20 25 30 Pro Ser Ser Leu Ser Cys Asn Asn Phe Lys Lys Val Ser SerSer Pro 35 40 45 Ile Thr Cys Lys Ala Val Ser Leu Ser Pro Pro Ser Thr IleGlu Gly 50 55 60 Leu Asn Ile Ala Glu Asp Val Ser Gln Leu Ile Gly Lys ThrPro Met 65 70 75 80 Val Tyr Leu Asn Asn Val Ser Lys Gly Ser Val Ala AsnIle Ala Ala 85 90 95 Lys Leu Glu Ser Met Glu Pro Cys Cys Ser Val Lys AspArg Ile Gly 100 105 110 Tyr Ser Met Ile Asp Asp Ala Glu Gln Lys Gly ValIle Thr Pro Gly 115 120 125 Lys Thr Thr Leu Val Glu Pro Thr Ser Gly AsnThr Gly Ile Gly Leu 130 135 140 Ala Phe Ile Ala Ala Ala Arg Gly Tyr LysIle Thr Leu Thr Met Pro 145 150 155 160 Ala Ser Met Ser Met Glu Arg ArgVal Ile Leu Lys Ala Phe Gly Ala 165 170 175 Glu Leu Val Leu Thr Asp ProAla Lys Gly Met Lys Gly Ala Val Glu 180 185 190 Lys Ala Glu Glu Ile LeuLys Lys Thr Pro Asp Ser Tyr Met Leu Gln 195 200 205 Gln Phe Asp Asn ProAla Asn Pro Lys Ile His Tyr Glu Thr Thr Gly 210 215 220 Pro Glu Ile TrpGlu Asp Thr Lys Gly Lys Val Asp Ile Phe Val Ala 225 230 235 240 Gly IleGly Thr Gly Gly Thr Ile Ser Gly Val Gly Arg Tyr Leu Lys 245 250 255 GluArg Asn Pro Gly Val Gln Val Ile Gly Ile Glu Pro Thr Glu Ser 260 265 270Asn Ile Leu Ser Gly Gly Lys Pro Gly Pro His Lys Ile Gln Gly Leu 275 280285 Gly Ala Gly Phe Val Pro Ser Asn Leu Asp Leu Gly Val Met Asp Glu 290295 300 Val Ile Glu Val Ser Ser Glu Glu Ala Val Glu Met Ala Lys Gln Leu305 310 315 320 Ala Met Lys Glu Gly Leu Leu Val Gly Ile Ser Ser Gly AlaAla Ala 325 330 335 Ala Ala Ala Val Arg Ile Gly Lys Arg Pro Glu Asn AlaGly Lys Leu 
340 345 350 Ile Ala Val Val Phe Pro Ser Phe Gly Glu Arg TyrLeu Ser Ser Ile 355 360 365 Leu Phe Gln Ser Ile Arg Glu Glu Cys Glu AsnMet Lys Pro Glu 370 375 380 66 386 PRT Solanum tuberosum 66 Met Ala SerPhe Ile Asn Asn Pro Leu Thr Ser Leu Cys Asn Thr Lys 1 5 10 15 Ser GluArg Asn Asn Leu Phe Lys Ile Ser Leu Tyr Glu Ala Gln Ser 20 25 30 Leu GlyPhe Ser Lys Leu Asn Gly Ser Arg Lys Val Ala Phe Pro Ser 35 40 45 Val ValCys Lys Ala Val Ser Val Pro Thr Lys Ser Ser Thr Glu Ile 50 55 60 Glu GlyLeu Asn Ile Ala Glu Asp Val Thr Gln Leu Ile Gly Asn Thr 65 70 75 80 ProMet Val Tyr Leu Asn Thr Ile Ala Lys Gly Cys Val Ala Asn Ile 85 90 95 AlaAla Lys Leu Glu Ile Met Glu Pro Cys Cys Ser Val Lys Asp Arg 100 105 110Ile Gly Phe Ser Met Ile Val Asp Ala Glu Glu Lys Gly Leu Ile Ser 115 120125 Pro Gly Lys Thr Val Leu Val Glu Pro Thr Ser Gly Asn Thr Gly Ile 130135 140 Gly Leu Ala Phe Ile Ala Ala Ser Arg Gly Tyr Lys Leu Ile Leu Thr145 150 155 160 Met Pro Ala Ser Met Ser Leu Glu Arg Arg Val Ile Leu LysAla Phe 165 170 175 Gly Ala Glu Leu Val Leu Thr Asp Pro Ala Lys Gly MetLys Gly Ala 180 185 190 Val Ser Lys Ala Glu Glu Ile Leu Asn Asn Thr ProAsp Ala Tyr Ile 195 200 205 Leu Gln Gln Phe Asp Asn Pro Ala Asn Pro LysIle His Tyr Glu Thr 210 215 220 Thr Gly Pro Glu Ile Trp Glu Asp Thr LysGly Lys Ile Asp Ile Leu 225 230 235 240 Val Ala Gly Ile Gly Thr Gly GlyThr Ile Thr Gly Thr Gly Arg Phe 245 250 255 Leu Lys Glu Gln Asn Pro AsnIle Lys Ile Ile Gly Val Glu Pro Thr 260 265 270 Glu Ser Asn Val Leu SerGly Gly Lys Pro Gly Pro His Lys Ile Gln 275 280 285 Gly Ile Gly Ala GlyPhe Ile Pro Gly Asn Leu Asp Gln Asp Val Met 290 295 300 Asp Glu Val IleGlu Ile Ser Ser Asp Glu Ala Val Glu Thr Ala Arg 305 310 315 320 Thr LeuAla Leu Gln Glu Gly Leu Leu Val Gly Ile Ser Ser Gly Ala 325 330 335 AlaAla Leu Ala Ala Ile Gln Val Gly Lys Arg Pro Glu Asn Ala Gly 340 345 350Lys Leu Ile Gly Val Val Phe Pro Ser Tyr Gly Glu Arg Tyr Leu Ser 355 360365 Ser Ile Leu Phe Gln Ser Ile Arg Glu Glu Cys Glu Lys Met Lys Pro 370375 380 Glu Leu 385 
67 1581 DNA Zea mays 67 ggccgtggct tactggcttccacccacagc cttcgcactt ccctccttcc tcgcaaatgg 60 ccgtcgccgt ccccaacgctcccggccgcc tcttccttct ccaatccacc ccgttcccga 120 accctagcag ctcggcatccgccgctcgag cccaatcctt ccgcgtacca cccctccgcc 180 tctcgctatt ccgacgcatggctgggcgct cgctgacggt gatcgcaggc gcctccggcg 240 gctccgaacg agatctcagcgcctccgcag tctccgtgga ggccctggac tccgtcgcct 300 ccgattctga cttagagacgaaggagccca gtgtgtcgac gatgctgacg agcttcgaga 360 actcgttcga caagtatggggctctgagca caccgctgta ccagaccgcc acctttaagc 420 agccttcagc tacagattatggaacttatg attacactag aagtggtaac cctactcgtg 480 atgttctcca gagcctcatggctaagcttg agaaagcaga tcaagcattc tgcttcacca 540 gcgggatggc ggcgttagctgcagtaaaac acctccttca ggctggacaa gaaatagttg 600 ctggtgagga catatatggtggttctgatc gtctactctc gcaagttgtg ccaagaaatg 660 gaatagttgt aaaacgagtagatacaacga aaattagtga tgtggtgtct gcaattggac 720 cctccactag actggtttggctcgaaagtc ccacgaaccc tcgtcagcaa attactgaca 780 ttaagacaat ctcagagatagcgcattctc atggtgctct tgttttggtt gacaacagca 840 tcatgtctcc agtgctctcccgtcctatag aactgggagc tgatatcgtg atgcactcgg 900 ctaccaaatt tatagcgggacatagtgatc ttatggctgg aattcttgca gtgaagggtg 960 agagtttggc taaagaggtagggtttctgc aaaatgctga agggtcgggt ctggcacctt 1020 ttgactgctg gctttgcttgaggggaatca aaaccatggc tctgcgggtg gagaaacaac 1080 aggctaatgc ccagaagattgctgaattcc tggcgtctca cccgagggtc aagcaagtaa 1140 actacgctgg gcttcctgaccatcctgggc gagctttaca ctattcccag gcaaagggag 1200 cgggctctgt tctcagttttctcaccggct cactggccct ctcaaagcac gtcgtggaga 1260 ccaccaagta cttcagcgtaacagtcagct tcgggagcgt gaagtccctc atcagcctgc 1320 cgtgcttcat gtcccacgcatcaatccctg cctcggtccg cgaggagcgt ggcctaaccg 1380 acgacctcgt ccggatatcggtcggcatcg aggatgtcga ggacctcatc gccgatctgg 1440 accgcgcgct cagaactggcccggtgtaga catcgccgat ccttaggtca tgtcaagcta 1500 tcttttgatg attcattggttgactgcttg cgtgatgata ataatgggaa tgttgcttgg 1560 ataaaaaaaa aaaaaaaaaa a1581 68 470 PRT Zea mays 68 Met Ala Val Ala Val Pro Asn Ala Pro Gly ArgLeu Phe Leu Leu Gln 1 5 10 15 Ser Thr Pro Phe Pro Asn Pro Ser Ser SerAla Ser 
Ala Ala Arg Ala 20 25 30 Gln Ser Phe Arg Val Pro Pro Leu Arg LeuSer Leu Phe Arg Arg Met 35 40 45 Ala Gly Arg Ser Leu Thr Val Ile Ala GlyAla Ser Gly Gly Ser Glu 50 55 60 Arg Asp Leu Ser Ala Ser Ala Val Ser ValGlu Ala Leu Asp Ser Val 65 70 75 80 Ala Ser Asp Ser Asp Leu Glu Thr LysGlu Pro Ser Val Ser Thr Met 85 90 95 Leu Thr Ser Phe Glu Asn Ser Phe AspLys Tyr Gly Ala Leu Ser Thr 100 105 110 Pro Leu Tyr Gln Thr Ala Thr PheLys Gln Pro Ser Ala Thr Asp Tyr 115 120 125 Gly Thr Tyr Asp Tyr Thr ArgSer Gly Asn Pro Thr Arg Asp Val Leu 130 135 140 Gln Ser Leu Met Ala LysLeu Glu Lys Ala Asp Gln Ala Phe Cys Phe 145 150 155 160 Thr Ser Gly MetAla Ala Leu Ala Ala Val Lys His Leu Leu Gln Ala 165 170 175 Gly Gln GluIle Val Ala Gly Glu Asp Ile Tyr Gly Gly Ser Asp Arg 180 185 190 Leu LeuSer Gln Val Val Pro Arg Asn Gly Ile Val Val Lys Arg Val 195 200 205 AspThr Thr Lys Ile Ser Asp Val Val Ser Ala Ile Gly Pro Ser Thr 210 215 220Arg Leu Val Trp Leu Glu Ser Pro Thr Asn Pro Arg Gln Gln Ile Thr 225 230235 240 Asp Ile Lys Thr Ile Ser Glu Ile Ala His Ser His Gly Ala Leu Val245 250 255 Leu Val Asp Asn Ser Ile Met Ser Pro Val Leu Ser Arg Pro IleGlu 260 265 270 Leu Gly Ala Asp Ile Val Met His Ser Ala Thr Lys Phe IleAla Gly 275 280 285 His Ser Asp Leu Met Ala Gly Ile Leu Ala Val Lys GlyGlu Ser Leu 290 295 300 Ala Lys Glu Val Gly Phe Leu Gln Asn Ala Glu GlySer Gly Leu Ala 305 310 315 320 Pro Phe Asp Cys Trp Leu Cys Leu Arg GlyIle Lys Thr Met Ala Leu 325 330 335 Arg Val Glu Lys Gln Gln Ala Asn AlaGln Lys Ile Ala Glu Phe Leu 340 345 350 Ala Ser His Pro Arg Val Lys GlnVal Asn Tyr Ala Gly Leu Pro Asp 355 360 365 His Pro Gly Arg Ala Leu HisTyr Ser Gln Ala Lys Gly Ala Gly Ser 370 375 380 Val Leu Ser Phe Leu ThrGly Ser Leu Ala Leu Ser Lys His Val Val 385 390 395 400 Glu Thr Thr LysTyr Phe Ser Val Thr Val Ser Phe Gly Ser Val Lys 405 410 415 Ser Leu IleSer Leu Pro Cys Phe Met Ser His Ala Ser Ile Pro Ala 420 425 430 Ser ValArg Glu Glu Arg Gly Leu Thr Asp Asp Leu Val Arg Ile Ser 435 440 445 ValGly Ile Glu Asp 
Val Glu Asp Leu Ile Ala Asp Leu Asp Arg Ala 450 455 460Leu Arg Thr Gly Pro Val 465 470 69 1685 DNA Oryza sativa 69 aggcaaccatgagcgccgcc gccgccgccg ccgccgccgc cgcaatcccc acctctctcg 60 gccgcctcttccacctccgc cccaccccga acccctcccg gaaccttagc ggcagctcag 120 cgcaacccctcctccgcctc agctaccacc cacgcctcac gctctctcgc cgcatggagg 180 cgccggcggcgatcgccgac tcccacggcg gcggcgacct gagcgcgtcc gcggtcggcg 240 cggaggcgctgggcgccgtc gccgctccgg atttcgatgt ggagatgaag gagcctagcg 300 tggcgacgatactgacgagc ttcgagaact cgttcgatgg gttcgggtct atgagcacgc 360 cgctgtaccagacggccacg tttaagcagc cttcagcaac cgataatgga ccttatgatt 420 acactagaagtggtaaccct acacgtgatg ttctccaaag ccttatggct aagcttgaga 480 aggcggatcaggcattctgc ttcaccagtg ggatggcagc actagctgca gtaacacacc 540 tccttaagtctggacaagaa atagttgctg gagaggacat atatggtggc tcagaccgtc 600 tgctctcacaagttgccccg agacatggga ttgtagtaaa acgaattgat acaaccaaaa 660 ttagtgaggtaacttctgca attgggccct tgactaaact agtatggctt gaaagtccca 720 ccaatccccgtctacaaatt actgatataa agaaaatagc agagatagct cattaccatg 780 gtgctcttgttttagtagac aacagcatca tgtctcctgt gctctcccgt cctctagaac 840 ttggagcagatattgttatg cactcagcaa ccaaatttat agctggacat agcgatctta 900 tggctggaattcttgcggtg aagggtgaaa gcagcttggc taaagagatt gcatttctac 960 aaaatgctgaaggatcaggt ttggcaccat ttgattgctg gctttgtttg agaggaatca 1020 aaaccatggctttgcgggtg gagaagcagc aggctaatgc tcagaagatt gctgaatttc 1080 tagcttctcatccaagagta aagaaagtga actatgcagg acttcctgat catcctggac 1140 gatctctacactattcccag gcaaagggag cgggttcagt tctcagtttc ctaactggtt 1200 cattagctctctcaaaacat gttgttgaga ccacaaagta cttcaatgta acagttagct 1260 ttggaagtgtgaaatcgctc attagcctgc catgcttcat gtcacacgcc agcatccctt 1320 ctgcggttcgcgaggagcgc ggcctgacag acgatctagt caggatatcg gttggaattg 1380 aggatgccgacgacctcata gcggatcttg atcatgctct ccggtctggt ccagcttaga 1440 gcctgtgaattctgtgccct tcctgttcgt tagggatgta gatgtggtca tgtgggtgct 1500 atctgtgtgggtgattgatt cattggtcaa ctcaataagc tgctgtgtca tcgagggaat 1560 aaagacaatctatcccaaat tttttaacac catatggtga ccaactgacc atgatatggt 1620 
cttaatcaattgatatttat agaaggtttc tttgaactgc aaaaaaaaaa aaaaaaaaaa 1680 aaaaa 168570 476 PRT Oryza sativa 70 Met Ser Ala Ala Ala Ala Ala Ala Ala Ala AlaAla Ile Pro Thr Ser 1 5 10 15 Leu Gly Arg Leu Phe His Leu Arg Pro ThrPro Asn Pro Ser Arg Asn 20 25 30 Leu Ser Gly Ser Ser Ala Gln Pro Leu LeuArg Leu Ser Tyr His Pro 35 40 45 Arg Leu Thr Leu Ser Arg Arg Met Glu AlaPro Ala Ala Ile Ala Asp 50 55 60 Ser His Gly Gly Gly Asp Leu Ser Ala SerAla Val Gly Ala Glu Ala 65 70 75 80 Leu Gly Ala Val Ala Ala Pro Asp PheAsp Val Glu Met Lys Glu Pro 85 90 95 Ser Val Ala Thr Ile Leu Thr Ser PheGlu Asn Ser Phe Asp Gly Phe 100 105 110 Gly Ser Met Ser Thr Pro Leu TyrGln Thr Ala Thr Phe Lys Gln Pro 115 120 125 Ser Ala Thr Asp Asn Gly ProTyr Asp Tyr Thr Arg Ser Gly Asn Pro 130 135 140 Thr Arg Asp Val Leu GlnSer Leu Met Ala Lys Leu Glu Lys Ala Asp 145 150 155 160 Gln Ala Phe CysPhe Thr Ser Gly Met Ala Ala Leu Ala Ala Val Thr 165 170 175 His Leu LeuLys Ser Gly Gln Glu Ile Val Ala Gly Glu Asp Ile Tyr 180 185 190 Gly GlySer Asp Arg Leu Leu Ser Gln Val Ala Pro Arg His Gly Ile 195 200 205 ValVal Lys Arg Ile Asp Thr Thr Lys Ile Ser Glu Val Thr Ser Ala 210 215 220Ile Gly Pro Leu Thr Lys Leu Val Trp Leu Glu Ser Pro Thr Asn Pro 225 230235 240 Arg Leu Gln Ile Thr Asp Ile Lys Lys Ile Ala Glu Ile Ala His Tyr245 250 255 His Gly Ala Leu Val Leu Val Asp Asn Ser Ile Met Ser Pro ValLeu 260 265 270 Ser Arg Pro Leu Glu Leu Gly Ala Asp Ile Val Met His SerAla Thr 275 280 285 Lys Phe Ile Ala Gly His Ser Asp Leu Met Ala Gly IleLeu Ala Val 290 295 300 Lys Gly Glu Ser Ser Leu Ala Lys Glu Ile Ala PheLeu Gln Asn Ala 305 310 315 320 Glu Gly Ser Gly Leu Ala Pro Phe Asp CysTrp Leu Cys Leu Arg Gly 325 330 335 Ile Lys Thr Met Ala Leu Arg Val GluLys Gln Gln Ala Asn Ala Gln 340 345 350 Lys Ile Ala Glu Phe Leu Ala SerHis Pro Arg Val Lys Lys Val Asn 355 360 365 Tyr Ala Gly Leu Pro Asp HisPro Gly Arg Ser Leu His Tyr Ser Gln 370 375 380 Ala Lys Gly Ala Gly SerVal Leu Ser Phe Leu Thr Gly Ser Leu Ala 385 390 395 400 Leu Ser Lys 
HisVal Val Glu Thr Thr Lys Tyr Phe Asn Val Thr Val 405 410 415 Ser Phe GlySer Val Lys Ser Leu Ile Ser Leu Pro Cys Phe Met Ser 420 425 430 His AlaSer Ile Pro Ser Ala Val Arg Glu Glu Arg Gly Leu Thr Asp 435 440 445 AspLeu Val Arg Ile Ser Val Gly Ile Glu Asp Ala Asp Asp Leu Ile 450 455 460Ala Asp Leu Asp His Ala Leu Arg Ser Gly Pro Ala 465 470 475 71 1699 DNATriticum aestivum 71 gcacgagagc gtggccacga tactgaccag cttcgagaactcgttcgaca agtatggggc 60 tctcagcacg ccgctgtacc agacggccac cttcaagcagccttcagcaa ccgttaatgg 120 agcttatgat tatactagaa gtggcaaccc tactcgtgatgttctccaga gccttatggc 180 taagctcgag aaggcagacc aagcattctg cttcactagtgggatggcat cactggctgc 240 agtaacacac ctccttcagg ctggacaaga aatagttgctggagaggaca tatatggtgg 300 ctctgatcgt ctgctctcac aagttgtccc aagaaatggaattgtagtaa aacgggtcga 360 tacaactaaa attaacgacg tgactgctgc aatcggacccttgactagac tagtttggct 420 tgaaagtccc accaatcctc gtcaacaaat tactgatataaagaaaatct cagagatagc 480 tcattctcat ggtgcacttg ttttggtgga caacagtatcatgtctccag tgctatcctg 540 gcctatagaa cttggagcag atattgtgat gcactcagctaccaaattta tagctggaca 600 cagtgatctt atggctggaa ttcttgctgt aaagggtgaaagcttggcta aggagattgc 660 atttctacaa aacgctgaag gttctggttt ggcaccttttgattgttggc tttgcttgag 720 agggatcaaa accatggcct tacgggtgga aaagcaacaggataatgccc agaagattgc 780 tgaattctta gcttctcatc caagggtcaa gcaagtgaattatgctggac ttcctgatca 840 tcctggccga tctttacact actctcaggc aaagggagcgggctctgtcc tcagtttcca 900 aactggttca ttgtctctct caaagcatgt tgttgagacaaccaagtact tcaacgtaac 960 agttagcttc ggaagtgtga agtcactcat aagcttgccctgcttcatgt cgcacgcgag 1020 catcccttcc tcggtgcgag aggagcgtgg gttgactgatgatctagtac ggatatcggt 1080 gggtattgag gatgtggatg acctcatagc tgatcttgattacgcgctca ggtccggtcc 1140 agcatagatc atacaaaatc tggactatgg cgcttcgggttctagttaat caagttgtag 1200 atgtgatatg cattggtgat tcatttgtta agctgcaacagtaataataa acttctgcac 1260 gagtattttc tgaaatgacg agcccacggt tgtatgtgttgttcctcata ggcttcaaca 1320 gaaaaaccct gaggccaact gacaagtagc aacattcataaacttcacaa catcgatact 1380 tggttctgcc catgttcatt 
tttcttggct gccattgtgacggctttgta gctcaagtag 1440 gaaggagtga catggccgtt ggttgatggg gagaaaaggagttggttcgt cggatcgatc 1500 cgtgtaggcg cttgtgtatt ttgtatatgg tgtttttcgtctgtgcaggt gagtctgtgt 1560 atacatctgg agactggatt attcatggtc attggtgtggcggtgaagaa taatgtgacg 1620 attcttttgt agtgtatcta agaactgtga tgttcttgtgcaaaaaaaaa aaaaaaaaaa 1680 aaaaaaaaaa aaaaaaaaa 1699 72 381 PRT Triticumaestivum 72 His Glu Ser Val Ala Thr Ile Leu Thr Ser Phe Glu Asn Ser PheAsp 1 5 10 15 Lys Tyr Gly Ala Leu Ser Thr Pro Leu Tyr Gln Thr Ala ThrPhe Lys 20 25 30 Gln Pro Ser Ala Thr Val Asn Gly Ala Tyr Asp Tyr Thr ArgSer Gly 35 40 45 Asn Pro Thr Arg Asp Val Leu Gln Ser Leu Met Ala Lys LeuGlu Lys 50 55 60 Ala Asp Gln Ala Phe Cys Phe Thr Ser Gly Met Ala Ser LeuAla Ala 65 70 75 80 Val Thr His Leu Leu Gln Ala Gly Gln Glu Ile Val AlaGly Glu Asp 85 90 95 Ile Tyr Gly Gly Ser Asp Arg Leu Leu Ser Gln Val ValPro Arg Asn 100 105 110 Gly Ile Val Val Lys Arg Val Asp Thr Thr Lys IleAsn Asp Val Thr 115 120 125 Ala Ala Ile Gly Pro Leu Thr Arg Leu Val TrpLeu Glu Ser Pro Thr 130 135 140 Asn Pro Arg Gln Gln Ile Thr Asp Ile LysLys Ile Ser Glu Ile Ala 145 150 155 160 His Ser His Gly Ala Leu Val LeuVal Asp Asn Ser Ile Met Ser Pro 165 170 175 Val Leu Ser Trp Pro Ile GluLeu Gly Ala Asp Ile Val Met His Ser 180 185 190 Ala Thr Lys Phe Ile AlaGly His Ser Asp Leu Met Ala Gly Ile Leu 195 200 205 Ala Val Lys Gly GluSer Leu Ala Lys Glu Ile Ala Phe Leu Gln Asn 210 215 220 Ala Glu Gly SerGly Leu Ala Pro Phe Asp Cys Trp Leu Cys Leu Arg 225 230 235 240 Gly IleLys Thr Met Ala Leu Arg Val Glu Lys Gln Gln Asp Asn Ala 245 250 255 GlnLys Ile Ala Glu Phe Leu Ala Ser His Pro Arg Val Lys Gln Val 260 265 270Asn Tyr Ala Gly Leu Pro Asp His Pro Gly Arg Ser Leu His Tyr Ser 275 280285 Gln Ala Lys Gly Ala Gly Ser Val Leu Ser Phe Gln Thr Gly Ser Leu 290295 300 Ser Leu Ser Lys His Val Val Glu Thr Thr Lys Tyr Phe Asn Val Thr305 310 315 320 Val Ser Phe Gly Ser Val Lys Ser Leu Ile Ser Leu Pro CysPhe Met 325 330 335 Ser His Ala Ser Ile Pro Ser Ser Val Arg 
Glu Glu Arg Gly Leu Thr 340 345 350 Asp Asp Leu Val Arg Ile Ser Val Gly Ile Glu Asp Val Asp Asp Leu 355 360 365 Ile Ala Asp Leu Asp Tyr Ala Leu Arg Ser Gly Pro Ala 370 375 380

What is claimed is:
 1. An isolated polynucleotide that encodes a plant cysteine γ synthase having amino acid sequence identity of at least 95% based on the Clustal method of alignment when compared to a polypeptide selected from the group consisting of SEQ ID NOs:31, 62, and 64.
 2. The polynucleotide of claim 1, wherein the polynucleotide encodes a polypeptide selected from the group consisting of SEQ ID NOs:31, 62, and 64.
 3. The polynucleotide of claim 1, wherein the polynucleotide comprises a nucleotide sequence selected from the group consisting of SEQ ID NO:30, 61, and 63.
 4. An isolated complement of the polynucleotide of claim 1, wherein (a) the complement and the polynucleotide consist of the same number of nucleotides, and (b) the nucleotide sequences of the complement and the polynucleotide have 100% complementarity.
 5. An isolated nucleic acid molecule that (1) comprises at least 180 nucleotides (2) remains hybridized with a polynucleotide having a nucleotide sequence selected from the group consisting of SEQ ID NO:30, 61, and 63 under a wash condition of 0.1×SSC, 0.1% SDS, and 65° C., and encodes a plant cysteine γ synthase.
 6. A cell comprising the polynucleotide of claim 1.
 7. The cell of claim 6, wherein the cell is selected from the group consisting of a yeast cell, a bacterial cell and a plant cell.
 8. A transgenic plant comprising the polynucleotide of claim 1.
 9. A method for transforming a cell comprising introducing into a cell the polynucleotide of claim 1.
 10. A method for producing a transgenic plant comprising (a) transforming a plant cell with the polynucleotide of claim 1, and (b) regenerating a plant from the transformed plant cell.
 11. A method for producing a polynucleotide fragment comprising (a) selecting a nucleotide sequence comprised by the polynucleotide of claim 1, and (b) synthesizing a polynucleotide fragment containing the nucleotide sequence.
 12. The method of claim 11, wherein the fragment is produced in vivo.
 13. A chimeric gene comprising the polynucleotide of claim 1 operably linked to at least one regulatory sequence.
 14. A method for altering the level of cysteine γ synthase expression in a host cell, the method comprising: (a) transforming a host cell with the chimeric gene of claim 13; and (b) growing the transformed cell from step (a) under conditions suitable for the expression of the chimeric gene.
 15. A method for evaluating a compound for its ability to inhibit the activity of a plant cysteine γ synthase, the method comprising the steps of: (a) transforming a host cell with a chimeric gene comprising a polynucleotide of claim 1, operably linked to at least one regulatory sequence; (b) growing the transformed host cell under conditions that are suitable for expression of the chimeric gene wherein expression of the chimeric gene results in production of the plant biosynthetic enzyme encoded by the operably linked nucleic acid fragment in the transformed host cell; (c) optionally purifying the plant biosynthetic enzyme polypeptide expressed by the transformed host cell; (d) treating the plant biosynthetic enzyme with a compound to be tested; (e) comparing the activity of the plant biosynthetic enzyme that has been treated with a test compound to the activity of an untreated plant biosynthetic enzyme polypeptide; and (f) selecting the compound that inhibits the activity of cysteine γ synthase.
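
Illustrative note (not part of the claims). The sequence comparisons recited in claims 1 and 4 can be sketched computationally. The Python sketch below is a simplified illustration under stated assumptions, not the Clustal method itself: `percent_identity` assumes the two sequences have already been aligned by some external tool and counts only columns without gaps, and `is_full_complement` assumes that a complement is written 5' to 3', so that it equals the reverse complement of the polynucleotide. The function names and the short test fragments are hypothetical; the fragments are taken from the opening residues of SEQ ID NOs:61, 62, and 64 purely for illustration.

```python
# Simplified sketch; the function names are hypothetical and the identity
# calculation is NOT the Clustal method, only a count over a given alignment.

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (a, c, g, t only)."""
    pairs = {"a": "t", "c": "g", "g": "c", "t": "a"}
    return "".join(pairs[base] for base in reversed(seq.lower()))


def is_full_complement(polynucleotide: str, candidate: str) -> bool:
    """Check both conditions of claim 4: equal length and 100% complementarity.

    Assumption: both strands are written 5' to 3'.
    """
    same_length = len(polynucleotide) == len(candidate)
    return same_length and candidate.lower() == reverse_complement(polynucleotide)


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the gap-free columns of a pre-computed alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    # First 12 nucleotides of SEQ ID NO:61, used only as a toy input.
    fragment = "aaggatggcgtc"
    print(is_full_complement(fragment, reverse_complement(fragment)))
    # Ten-residue fragments from SEQ ID NOs:62 and 64 (one-letter code).
    print(percent_identity("MASWSSPSAA", "MASWSSPVAA"))
```

A real assessment against the 95% threshold of claim 1 would require full-length sequences and an alignment produced with the Clustal parameters given in the specification; the sketch only shows how the two claim conditions translate into simple checks.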